Sparse Autoencoders for Qwen 3.5-35B-A3B
This repository contains nine TopK sparse autoencoders (SAEs) trained on the residual-stream activations of Qwen/Qwen3.5-35B-A3B, a Mixture-of-Experts model (35B total / 3B active parameters) with a hybrid GatedDeltaNet + Attention architecture.
Training pipeline: zactheaipm/qwenscope
SAE Overview
| SAE ID | Layer | Type | Block | Position | Dict Size | TopK | LR |
|---|---|---|---|---|---|---|---|
| sae_delta_early | 6 | DeltaNet | 1 | 2 | 8,192 | 128 | 3e-5 |
| sae_attn_early | 7 | Attention | 1 | 3 | 8,192 | 128 | 3e-5 |
| sae_delta_earlymid | 14 | DeltaNet | 3 | 2 | 16,384 | 96 | 2e-5 |
| sae_attn_earlymid | 15 | Attention | 3 | 3 | 16,384 | 96 | 2e-5 |
| sae_delta_mid_pos1 | 21 | DeltaNet | 5 | 1 | 16,384 | 64 | 1e-5 |
| sae_delta_mid | 22 | DeltaNet | 5 | 2 | 16,384 | 64 | 1e-5 |
| sae_attn_mid | 23 | Attention | 5 | 3 | 16,384 | 64 | 1e-5 |
| sae_delta_late | 34 | DeltaNet | 8 | 2 | 16,384 | 64 | 8e-6 |
| sae_attn_late | 35 | Attention | 8 | 3 | 16,384 | 64 | 8e-6 |
Architecture
Qwen 3.5-35B-A3B uses a repeating 4-layer block pattern: 3 GatedDeltaNet layers followed by 1 standard attention layer. The model has 40 layers total (10 blocks). SAEs are trained at 4 depth levels (early, early-mid, mid, late), covering one DeltaNet layer and the attention layer at each depth, plus a second DeltaNet layer (position 1) at mid depth.
- Hidden dimension: 2,048
- MoE config: 256 experts, 8 routed + 1 shared per token
- SAE type: TopK with auxiliary dead-feature loss (aux-k resampling)
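The block pattern described above can be turned into a small lookup that reproduces the Layer, Type, Block, and Position columns of the overview table. This is a minimal sketch: the function name `describe_layer` is hypothetical, and it assumes layers, blocks, and positions are all 0-indexed with the attention layer last in each block, which is inferred from the table rather than stated explicitly.

```python
def describe_layer(layer: int, block_size: int = 4, num_layers: int = 40) -> dict:
    """Map a 0-indexed layer to its block, in-block position, and layer type.

    Assumption: each 4-layer block is [DeltaNet, DeltaNet, DeltaNet, Attention].
    """
    if not 0 <= layer < num_layers:
        raise ValueError(f"layer must be in [0, {num_layers}), got {layer}")
    block, position = divmod(layer, block_size)
    layer_type = "Attention" if position == block_size - 1 else "DeltaNet"
    return {"block": block, "position": position, "type": layer_type}


# Layers that have an SAE in the overview table.
for layer in (6, 7, 14, 15, 21, 22, 23, 34, 35):
    print(layer, describe_layer(layer))
```

For example, `describe_layer(23)` gives block 5, position 3, type Attention, matching the `sae_attn_mid` row.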
Training Details
- Training tokens: 200M per SAE
- Batch size: 4,096 tokens
- Warmup: 1,000 steps
- Data: HuggingFaceH4/ultrachat_200k + allenai/WildChat-1M (GDPR-filtered)
- Includes tool-use data: Yes; see zactheaipm/agent-tool-use-synthetic
- Methodology: FAST (sequential processing)
- Max sequence length: 2,048
- Resampling: Every 5,000 steps (dead feature revival)
- Checkpointing: Every 50M tokens
Learning rates are scaled inversely with activation norm at each depth (from 3e-5 at the early layers down to 8e-6 at the late layers).
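The training figures above imply a step count and a number of resampling and checkpoint events per SAE. The arithmetic below is a rough sketch: the constant names are illustrative, and it assumes one optimizer step per 4,096-token batch. The last line also assumes the learning rate is exactly proportional to the inverse activation norm, which the text does not state this precisely.

```python
# Values copied from the Training Details list; constant names are illustrative.
TOKENS_PER_SAE = 200_000_000
BATCH_TOKENS = 4_096
WARMUP_STEPS = 1_000
RESAMPLE_EVERY_STEPS = 5_000
CHECKPOINT_EVERY_TOKENS = 50_000_000

# Assumption: one optimizer step consumes exactly one batch.
total_steps = TOKENS_PER_SAE // BATCH_TOKENS              # 48828
resample_events = total_steps // RESAMPLE_EVERY_STEPS     # 9
checkpoints = TOKENS_PER_SAE // CHECKPOINT_EVERY_TOKENS   # 4
warmup_fraction = WARMUP_STEPS / total_steps

# Assumption: LR is exactly proportional to 1 / activation norm, so the LR ratio
# between early and late layers equals the implied late/early norm ratio.
implied_norm_ratio = 3e-5 / 8e-6

print(f"steps={total_steps}, resamples={resample_events}, checkpoints={checkpoints}")
print(f"warmup fraction={warmup_fraction:.3f}, implied norm ratio={implied_norm_ratio:.2f}")
```

Under these assumptions, warmup covers roughly 2% of training, and late-layer activations would have about 3.75 times the norm of early-layer ones.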
File Structure
sae_{type}_{depth}/
├── config.json          # {hidden_dim, dict_size, k}
└── weights.safetensors  # Encoder + decoder weights
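As a convenience, the repo-relative paths for every SAE can be generated from the IDs in the overview table. This is a small sketch; the helper name `repo_paths` is hypothetical, and it assumes each directory contains exactly the two files shown in the layout above.

```python
# SAE IDs as listed in the SAE Overview table.
SAE_IDS = [
    "sae_delta_early", "sae_attn_early",
    "sae_delta_earlymid", "sae_attn_earlymid",
    "sae_delta_mid_pos1", "sae_delta_mid", "sae_attn_mid",
    "sae_delta_late", "sae_attn_late",
]
FILES = ("config.json", "weights.safetensors")


def repo_paths(sae_id: str) -> list[str]:
    """Return the repo-relative file paths for one SAE directory."""
    return [f"{sae_id}/{name}" for name in FILES]


all_paths = [path for sae_id in SAE_IDS for path in repo_paths(sae_id)]
print(len(all_paths), "files expected")
```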
Usage
import json
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download
repo_id = "zactheaipm/qwen35-a3b-saes"
sae_id = "sae_attn_mid"
# Download
config_path = hf_hub_download(repo_id, f"{sae_id}/config.json")
weights_path = hf_hub_download(repo_id, f"{sae_id}/weights.safetensors")
# Load
with open(config_path) as f:
    config = json.load(f)
weights = load_file(weights_path)
print(config) # {'hidden_dim': 2048, 'dict_size': 16384, 'k': 64}
print(weights.keys()) # dict_keys with encoder/decoder matrices
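Once the weights are loaded, applying a TopK SAE to residual-stream activations might look like the sketch below. The tensor names (`w_enc`, `b_enc`, `w_dec`, `b_dec`), their layouts, and the pre-encoder bias subtraction are assumptions based on a common TopK SAE formulation, not the confirmed contents of `weights.safetensors`; inspect `weights.keys()` and the tensor shapes before adapting it. Random toy-sized weights are used so the example runs without downloading anything.

```python
import torch


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Generic TopK SAE forward pass (assumed formulation, not the repo's code).

    x:     (batch, hidden_dim) residual-stream activations
    w_enc: (hidden_dim, dict_size), b_enc: (dict_size,)
    w_dec: (dict_size, hidden_dim), b_dec: (hidden_dim,)
    """
    pre_acts = (x - b_dec) @ w_enc + b_enc
    # Keep only the k largest pre-activations per row; zero out the rest.
    top_vals, top_idx = torch.topk(pre_acts, k, dim=-1)
    latents = torch.zeros_like(pre_acts).scatter_(-1, top_idx, torch.relu(top_vals))
    recon = latents @ w_dec + b_dec
    return latents, recon


# Toy dimensions for illustration; the real SAEs use hidden_dim=2048.
torch.manual_seed(0)
hidden_dim, dict_size, k = 32, 256, 8
x = torch.randn(4, hidden_dim)
w_enc = torch.randn(hidden_dim, dict_size) * 0.1
b_enc = torch.zeros(dict_size)
w_dec = torch.randn(dict_size, hidden_dim) * 0.1
b_dec = torch.zeros(hidden_dim)

latents, recon = topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k)
print(latents.shape, recon.shape)
print((latents != 0).sum(dim=-1))  # at most k active features per row
```

With real weights, `k` would come from `config["k"]` and `x` would be the residual stream captured at the layer the SAE was trained on.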
Citation
If you use these SAEs in your research, please cite:
@misc{qwen35_a3b_saes_2026,
title={Sparse Autoencoders for Qwen 3.5-35B-A3B},
author={Zac Yap},
year={2026},
url={https://huggingface.co/zactheaipm/qwen35-a3b-saes}
}
License
MIT