Sparse Autoencoders for Qwen 3.5-35B-A3B

Nine TopK sparse autoencoders (SAEs) trained on the residual-stream activations of Qwen/Qwen3.5-35B-A3B, a Mixture-of-Experts model (35B total parameters, 3B active) with a hybrid GatedDeltaNet + Attention architecture.

Training pipeline: zactheaipm/qwenscope

SAE Overview

SAE ID	Layer	Type	Block	Position	Dict Size	TopK	LR
`sae_delta_early`	6	DeltaNet	1	2	8,192	128	3e-5
`sae_attn_early`	7	Attention	1	3	8,192	128	3e-5
`sae_delta_earlymid`	14	DeltaNet	3	2	16,384	96	2e-5
`sae_attn_earlymid`	15	Attention	3	3	16,384	96	2e-5
`sae_delta_mid_pos1`	21	DeltaNet	5	1	16,384	64	1e-5
`sae_delta_mid`	22	DeltaNet	5	2	16,384	64	1e-5
`sae_attn_mid`	23	Attention	5	3	16,384	64	1e-5
`sae_delta_late`	34	DeltaNet	8	2	16,384	64	8e-6
`sae_attn_late`	35	Attention	8	3	16,384	64	8e-6
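
The dictionary sizes and TopK values in the table are easier to compare as ratios. The sketch below recomputes, from the table values and the 2,048 hidden dimension stated in this card, each SAE's expansion factor (dictionary size / hidden dimension) and the fraction of features active per token (k / dictionary size). The names `SAES` and `sae_ratios` are illustrative only.

```python
# Values copied from the overview table: id -> (dict_size, k).
SAES = {
    "sae_delta_early": (8192, 128),
    "sae_attn_early": (8192, 128),
    "sae_delta_earlymid": (16384, 96),
    "sae_attn_earlymid": (16384, 96),
    "sae_delta_mid_pos1": (16384, 64),
    "sae_delta_mid": (16384, 64),
    "sae_attn_mid": (16384, 64),
    "sae_delta_late": (16384, 64),
    "sae_attn_late": (16384, 64),
}
HIDDEN_DIM = 2048  # residual stream width stated in this card


def sae_ratios(dict_size, k, hidden_dim=HIDDEN_DIM):
    """Return (expansion factor, fraction of features active per token)."""
    return dict_size / hidden_dim, k / dict_size


for sae_id, (dict_size, k) in SAES.items():
    expansion, active = sae_ratios(dict_size, k)
    print(f"{sae_id:20s} expansion={expansion:.0f}x  active={active:.2%}")
```

With these numbers, the two early SAEs use a 4x expansion with about 1.56% of features active per token, and the remaining seven use an 8x expansion with roughly 0.39% to 0.59% active.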

Architecture

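Qwen 3.5-35B-A3B uses a repeating 4-layer block pattern: 3 GatedDeltaNet layers followed by 1 standard attention layer. The model has 40 layers in total (10 blocks). SAEs are trained at 4 depth levels (early, early-mid, mid, late). At each depth, one SAE covers the last DeltaNet layer of a block and another covers the attention layer that follows it; the mid depth adds a third SAE on the preceding DeltaNet layer (`sae_delta_mid_pos1`).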
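
The Block and Position columns in the overview table are consistent with zero-based indexing, where layer = 4 × block + position and position 3 is the attention layer. The following sketch reproduces that mapping for the nine SAE layers; the function name `locate_layer` is illustrative and not part of any library.

```python
LAYERS_PER_BLOCK = 4  # 3 GatedDeltaNet layers followed by 1 attention layer
NUM_LAYERS = 40


def locate_layer(layer):
    """Map a zero-indexed layer to (block, position, layer type)."""
    if not 0 <= layer < NUM_LAYERS:
        raise ValueError(f"layer must be in [0, {NUM_LAYERS}), got {layer}")
    block, position = divmod(layer, LAYERS_PER_BLOCK)
    # The last position in each block is the attention layer.
    layer_type = "Attention" if position == LAYERS_PER_BLOCK - 1 else "DeltaNet"
    return block, position, layer_type


for layer in (6, 7, 14, 15, 21, 22, 23, 34, 35):
    print(layer, locate_layer(layer))
```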

Hidden dimension: 2,048
MoE config: 256 experts, 8 routed + 1 shared per token
SAE type: TopK with auxiliary dead-feature loss (aux-k resampling)
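
For readers unfamiliar with the TopK SAE type, the following is a minimal sketch of a generic TopK forward pass in PyTorch. It is not the code used to train these SAEs. The parameter names (`w_enc`, `b_enc`, `w_dec`, `b_dec`), the presence of biases, subtracting the decoder bias before encoding, and applying ReLU to the kept values are all assumptions; check the tensor names in `weights.safetensors` and the training pipeline before relying on it. The auxiliary dead-feature loss is not shown.

```python
import torch


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Generic TopK SAE: keep the k largest pre-activations per token.

    x:     (batch, hidden_dim) residual stream activations
    w_enc: (hidden_dim, dict_size), b_enc: (dict_size,)
    w_dec: (dict_size, hidden_dim), b_dec: (hidden_dim,)
    """
    pre_acts = (x - b_dec) @ w_enc + b_enc             # (batch, dict_size)
    top_vals, top_idx = torch.topk(pre_acts, k, dim=-1)
    feats = torch.zeros_like(pre_acts)
    feats.scatter_(-1, top_idx, torch.relu(top_vals))  # all other features stay 0
    recon = feats @ w_dec + b_dec                      # (batch, hidden_dim)
    return feats, recon


# Toy sizes for a quick check; sae_attn_mid uses hidden_dim=2048,
# dict_size=16384, k=64. Weights here are random, not trained.
hidden_dim, dict_size, k = 16, 64, 4
torch.manual_seed(0)
x = torch.randn(4, hidden_dim)
w_enc = torch.randn(hidden_dim, dict_size) / hidden_dim ** 0.5
w_dec = torch.randn(dict_size, hidden_dim) / dict_size ** 0.5
feats, recon = topk_sae_forward(
    x, w_enc, torch.zeros(dict_size), w_dec, torch.zeros(hidden_dim), k
)
print(feats.shape, recon.shape)
```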

Training Details

Training tokens: 200M per SAE
Batch size: 4,096 tokens
Warmup: 1,000 steps
Data: HuggingFaceH4/ultrachat_200k + allenai/WildChat-1M (GDPR-filtered)
Includes tool-use data: Yes — see zactheaipm/agent-tool-use-synthetic
Methodology: FAST (sequential processing)
Max sequence length: 2,048
Resampling: Every 5,000 steps (dead feature revival)
Checkpointing: Every 50M tokens
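
The token budget, batch size, and intervals listed above can be combined into an approximate per-SAE training schedule. This is plain arithmetic on the listed values, assuming each optimizer step consumes exactly one 4,096-token batch; the variable names are illustrative.

```python
TRAIN_TOKENS = 200_000_000
BATCH_TOKENS = 4_096
WARMUP_STEPS = 1_000
RESAMPLE_EVERY_STEPS = 5_000
CHECKPOINT_EVERY_TOKENS = 50_000_000

total_steps = TRAIN_TOKENS // BATCH_TOKENS          # full batches only
warmup_fraction = WARMUP_STEPS / total_steps
resample_events = total_steps // RESAMPLE_EVERY_STEPS
checkpoints = TRAIN_TOKENS // CHECKPOINT_EVERY_TOKENS
steps_per_checkpoint = CHECKPOINT_EVERY_TOKENS // BATCH_TOKENS

print(f"total steps:          {total_steps:,}")
print(f"warmup fraction:      {warmup_fraction:.2%}")
print(f"resampling events:    {resample_events}")
print(f"checkpoints:          {checkpoints}")
print(f"steps per checkpoint: {steps_per_checkpoint:,}")
```

Under that assumption, each SAE trains for roughly 48.8k steps, warmup covers about 2% of training, there are nine resampling points, and a checkpoint is written about every 12.2k steps.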

Learning rates decrease with depth (3e-5 early, 2e-5 early-mid, 1e-5 mid, 8e-6 late) because they are scaled inversely with the activation norm at each depth.

File Structure

sae_{type}_{depth}/
├── config.json            # {hidden_dim, dict_size, k}
└── weights.safetensors    # Encoder + decoder weights
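
Given that layout, the repository-relative paths for all nine SAEs can be generated from the IDs in the overview table, for example to fetch everything in a loop with `hf_hub_download`. The sketch below lists the IDs explicitly because `sae_delta_mid_pos1` does not fit the `sae_{type}_{depth}` pattern exactly; the helper name `sae_files` is illustrative.

```python
SAE_IDS = [
    "sae_delta_early", "sae_attn_early",
    "sae_delta_earlymid", "sae_attn_earlymid",
    "sae_delta_mid_pos1", "sae_delta_mid", "sae_attn_mid",
    "sae_delta_late", "sae_attn_late",
]
FILES = ("config.json", "weights.safetensors")


def sae_files(sae_id):
    """Return the repo-relative paths of one SAE's config and weights."""
    return [f"{sae_id}/{name}" for name in FILES]


all_paths = [path for sae_id in SAE_IDS for path in sae_files(sae_id)]
print(len(all_paths))
print(all_paths[:2])
```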

Usage

import json
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

repo_id = "zactheaipm/qwen35-a3b-saes"
sae_id = "sae_attn_mid"

# Download
config_path = hf_hub_download(repo_id, f"{sae_id}/config.json")
weights_path = hf_hub_download(repo_id, f"{sae_id}/weights.safetensors")

# Load
with open(config_path) as f:
    config = json.load(f)
weights = load_file(weights_path)

print(config)        # {'hidden_dim': 2048, 'dict_size': 16384, 'k': 64}
print(weights.keys())  # dict_keys with encoder/decoder matrices
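
Because the tensor names inside `weights.safetensors` are not listed in this card, a reasonable first step after loading is to inspect them rather than assume a naming scheme. The helper below is a sketch that works on any mapping from names to objects with a `.shape` attribute, such as the dict returned by `load_file`. `summarize_weights` is an illustrative name, and the key names and matrix orientations in the stand-in data are placeholders, not the real ones.

```python
import math
from types import SimpleNamespace


def summarize_weights(weights):
    """Return {name: (shape, parameter count)} for a dict of tensors."""
    return {
        name: (tuple(t.shape), math.prod(t.shape))
        for name, t in weights.items()
    }


# Stand-in objects with sizes matching the sae_attn_mid config
# (hidden_dim=2048, dict_size=16384); real key names may differ.
fake = {
    "encoder": SimpleNamespace(shape=(2048, 16384)),
    "decoder": SimpleNamespace(shape=(16384, 2048)),
}
summary = summarize_weights(fake)
total = sum(count for _, count in summary.values())
for name, (shape, count) in summary.items():
    print(f"{name:10s} {shape} {count:,}")
print(f"total parameters: {total:,}")
```

With real weights, call `summarize_weights(weights)` on the dict loaded in the usage snippet and compare the shapes against `config["hidden_dim"]` and `config["dict_size"]`.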

Citation

If you use these SAEs in your research, please cite:

@misc{qwen35_a3b_saes_2026,
  title={Sparse Autoencoders for Qwen 3.5-35B-A3B},
  author={Zac Yap},
  year={2026},
  url={https://huggingface.co/zactheaipm/qwen35-a3b-saes}
}

License

MIT
