Sparse Autoencoders for Qwen 3.5-35B-A3B

This repository contains nine TopK sparse autoencoders (SAEs) trained on residual stream activations of Qwen/Qwen3.5-35B-A3B, a Mixture-of-Experts model (35B total / 3B active parameters) with a hybrid GatedDeltaNet + Attention architecture.

Training pipeline: zactheaipm/qwenscope

SAE Overview

SAE ID              Layer  Type       Block  Position  Dict Size  TopK  LR
sae_delta_early     6      DeltaNet   1      2         8,192      128   3e-5
sae_attn_early      7      Attention  1      3         8,192      128   3e-5
sae_delta_earlymid  14     DeltaNet   3      2         16,384     96    2e-5
sae_attn_earlymid   15     Attention  3      3         16,384     96    2e-5
sae_delta_mid_pos1  21     DeltaNet   5      1         16,384     64    1e-5
sae_delta_mid       22     DeltaNet   5      2         16,384     64    1e-5
sae_attn_mid        23     Attention  5      3         16,384     64    1e-5
sae_delta_late      34     DeltaNet   8      2         16,384     64    8e-6
sae_attn_late       35     Attention  8      3         16,384     64    8e-6

Architecture

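Qwen 3.5-35B-A3B uses a repeating 4-layer block pattern: 3 GatedDeltaNet layers followed by 1 standard attention layer. The model has 40 layers in total (10 blocks). SAEs are trained at 4 depth levels (early, early-mid, mid, late), covering one DeltaNet layer and one attention layer at each depth; the mid depth has an additional DeltaNet SAE at position 1. Layer, block, and position indices in the SAE Overview table are 0-based, so position 3 is the attention layer of each block.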
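
The block pattern can be checked against the SAE Overview table with a few lines of Python. This is a minimal sketch, assuming 0-based layer indices with the attention layer in the last slot of each block (which is what the table values suggest); `describe_layer` is a hypothetical helper, not part of the training pipeline.

```python
# Sketch: map a 0-based layer index to (block, position, layer type),
# assuming each block is 3 GatedDeltaNet layers followed by 1 attention layer.
LAYERS_PER_BLOCK = 4
ATTENTION_POSITION = 3  # last slot in each block, 0-based (assumption)


def describe_layer(layer: int) -> tuple:
    """Return (block, position, type) for a 0-based layer index."""
    block, position = divmod(layer, LAYERS_PER_BLOCK)
    layer_type = "Attention" if position == ATTENTION_POSITION else "DeltaNet"
    return block, position, layer_type


# Layers that have an SAE in this repository.
for layer in (6, 7, 14, 15, 21, 22, 23, 34, 35):
    print(layer, describe_layer(layer))
```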

  • Hidden dimension: 2,048
  • MoE config: 256 experts, 8 routed + 1 shared per token
  • SAE type: TopK with auxiliary dead-feature loss (aux-k resampling)
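
For readers unfamiliar with TopK SAEs, the forward pass can be sketched as below. This is an illustrative NumPy version under stated assumptions, not the released implementation: the parameter names (`w_enc`, `b_enc`, `w_dec`, `b_dec`), the pre-encoder bias subtraction, and the ReLU on the kept values are all assumptions, and the auxiliary dead-feature loss is not shown.

```python
import numpy as np


def topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k):
    """Encode x, keep the k largest pre-activations per row, then decode."""
    pre = (x - b_dec) @ w_enc + b_enc  # (batch, dict_size)
    # Indices of the k largest pre-activations in each row.
    top_idx = np.argpartition(pre, -k, axis=-1)[:, -k:]
    top_vals = np.take_along_axis(pre, top_idx, axis=-1)
    # All other features stay at zero; ReLU keeps active features non-negative.
    latents = np.zeros_like(pre)
    np.put_along_axis(latents, top_idx, np.maximum(top_vals, 0.0), axis=-1)
    recon = latents @ w_dec + b_dec  # (batch, hidden_dim)
    return latents, recon


# Toy sizes for illustration only; the released SAEs use hidden_dim=2048.
rng = np.random.default_rng(0)
hidden_dim, dict_size, k = 16, 64, 4
x = rng.standard_normal((3, hidden_dim))
w_enc = rng.standard_normal((hidden_dim, dict_size)) * 0.1
w_dec = w_enc.T.copy()  # tied initialisation, purely for the demo
b_enc = np.zeros(dict_size)
b_dec = np.zeros(hidden_dim)

latents, recon = topk_sae_forward(x, w_enc, b_enc, w_dec, b_dec, k)
print((latents != 0).sum(axis=-1))  # number of active features per row
```

With the configurations in the table, `k` would be 128, 96, or 64 and `dict_size` 8,192 or 16,384, so only a small fraction of features is active for any token.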

Training Details

  • Training tokens: 200M per SAE
  • Batch size: 4,096 tokens
  • Warmup: 1,000 steps
  • Data: HuggingFaceH4/ultrachat_200k + allenai/WildChat-1M (GDPR-filtered)
  • Includes tool-use data: Yes (see zactheaipm/agent-tool-use-synthetic)
  • Methodology: FAST (sequential processing)
  • Max sequence length: 2,048
  • Resampling: Every 5,000 steps (dead feature revival)
  • Checkpointing: Every 50M tokens
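
The figures above imply a rough training schedule. The arithmetic below is a sketch that assumes one optimizer step consumes exactly one 4,096-token batch and that partial batches are dropped; the variable names are illustrative only.

```python
# Sketch: derive step counts from the training settings listed above.
TRAINING_TOKENS = 200_000_000
BATCH_TOKENS = 4_096
WARMUP_STEPS = 1_000
RESAMPLE_EVERY_STEPS = 5_000
CHECKPOINT_EVERY_TOKENS = 50_000_000

total_steps = TRAINING_TOKENS // BATCH_TOKENS
resampling_events = total_steps // RESAMPLE_EVERY_STEPS
checkpoints = TRAINING_TOKENS // CHECKPOINT_EVERY_TOKENS
steps_per_checkpoint = CHECKPOINT_EVERY_TOKENS // BATCH_TOKENS
warmup_fraction = WARMUP_STEPS / total_steps

print(f"optimizer steps per SAE: {total_steps:,}")
print(f"resampling events:       {resampling_events}")
print(f"checkpoints:             {checkpoints} (one every {steps_per_checkpoint:,} steps)")
print(f"warmup share of run:     {warmup_fraction:.1%}")
```

Under these assumptions each SAE trains for roughly 48.8k steps, so warmup covers about 2% of the run.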

Learning rates are scaled inversely with activation norm at each depth (3e-5 early → 8e-6 late).

File Structure

sae_{type}_{depth}/
├── config.json            # {hidden_dim, dict_size, k}
└── weights.safetensors    # Encoder + decoder weights

Usage

import json
import torch
from safetensors.torch import load_file
from huggingface_hub import hf_hub_download

repo_id = "zactheaipm/qwen35-a3b-saes"
sae_id = "sae_attn_mid"

# Download
config_path = hf_hub_download(repo_id, f"{sae_id}/config.json")
weights_path = hf_hub_download(repo_id, f"{sae_id}/weights.safetensors")

# Load
with open(config_path) as f:
    config = json.load(f)
weights = load_file(weights_path)

print(config)        # {'hidden_dim': 2048, 'dict_size': 16384, 'k': 64}
print(weights.keys())  # dict_keys with encoder/decoder matrices
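
Because the tensor names inside weights.safetensors are not documented here, a quick sanity check after loading is to list each tensor's shape and confirm that every dimension matches `hidden_dim` or `dict_size` from the config. The sketch below uses hypothetical helpers (`summarize_tensors`, `unexpected_dims`) and stand-in objects with made-up key names so that it runs without downloading anything; in practice you would pass the `weights` and `config` objects loaded above.

```python
from types import SimpleNamespace


def summarize_tensors(weights):
    """Return {name: shape} for any mapping of objects with a .shape attribute."""
    return {name: tuple(t.shape) for name, t in weights.items()}


def unexpected_dims(weights, config):
    """List (name, shape) pairs with a dimension that is neither hidden_dim nor dict_size."""
    allowed = {config["hidden_dim"], config["dict_size"]}
    return [
        (name, shape)
        for name, shape in summarize_tensors(weights).items()
        if any(dim not in allowed for dim in shape)
    ]


# Stand-ins for illustration; the key names here are hypothetical.
demo_config = {"hidden_dim": 2048, "dict_size": 16384, "k": 64}
demo_weights = {
    "encoder.weight": SimpleNamespace(shape=(16384, 2048)),
    "decoder.weight": SimpleNamespace(shape=(2048, 16384)),
}

for name, shape in summarize_tensors(demo_weights).items():
    print(name, shape)
print(unexpected_dims(demo_weights, demo_config))
```

An empty list from `unexpected_dims` only shows that the shapes are consistent with the config; it does not tell you which matrix is the encoder or whether the weights are stored transposed.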

Citation

If you use these SAEs in your research, please cite:

@misc{qwen35_a3b_saes_2026,
  title={Sparse Autoencoders for Qwen 3.5-35B-A3B},
  author={Zac Yap},
  year={2026},
  url={https://huggingface.co/zactheaipm/qwen35-a3b-saes}
}

License

MIT
