Qwen3.6-35B-REAP Pruned ratio 0.2

This is a REAP-pruned checkpoint derived from Qwen/Qwen3.6-35B-A3B. The pruning ratio is 0.20.

Pruning

The model was pruned with REAP (Router-weighted Expert Activation Pruning), which removes routed experts. Expert saliency was computed from router weights and expert activation norms on a calibration set of 1024 samples at sequence length 2048. Router weights were renormalized after pruning.
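A minimal sketch of how a saliency score of this kind can be computed, assuming the usual REAP formulation: for each expert, average the product of the router gate weight and the L2 norm of the expert output over the tokens routed to it. The function names and the list-of-lists inputs are hypothetical and are not taken from the actual pipeline. The renormalize helper shows only one possible reading of "renormalized" (rescaling the surviving gate weights to sum to 1).

```python
def reap_saliency(gate_weights, activation_norms):
    """Per-expert saliency: mean of (gate weight * output norm) over routed tokens.

    gate_weights[t][e] is the router weight of expert e for token t (0.0 when
    the expert was not selected); activation_norms[t][e] is the L2 norm of
    that expert's output for token t.
    """
    num_experts = len(gate_weights[0])
    totals = [0.0] * num_experts
    counts = [0] * num_experts
    for gates, norms in zip(gate_weights, activation_norms):
        for e, (g, n) in enumerate(zip(gates, norms)):
            if g > 0.0:  # token t was routed to expert e
                totals[e] += g * n
                counts[e] += 1
    # Experts that never fire get saliency 0.0 and are pruned first.
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def experts_to_keep(saliency, num_keep):
    """Indices of the num_keep most salient experts, in ascending index order."""
    ranked = sorted(range(len(saliency)), key=lambda e: saliency[e], reverse=True)
    return sorted(ranked[:num_keep])


def renormalize(gates):
    """Assumed renormalization: rescale surviving gate weights to sum to 1."""
    total = sum(gates)
    return [g / total for g in gates] if total else list(gates)
```

In a real run the gate weights and activation norms would be accumulated with forward hooks over the calibration set rather than passed in as dense lists.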

The calibration data follows the composite mixture described in the REAP paper:

theblackcat102/evol-codealpaca-v1
Salesforce/xlam-function-calling-60k
open-r1/Mixture-of-Thoughts[code]
open-r1/Mixture-of-Thoughts[math]
open-r1/Mixture-of-Thoughts[science]
SWE-bench/SWE-smith-trajectories(tool)

The checkpoint keeps the shared expert path unchanged. The routed MoE layers keep 205 experts per layer and num_experts_per_tok=8.
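The per-layer expert count can be sanity-checked with a short calculation. This is a sketch under two assumptions that this card does not state: that the base model has 256 routed experts per layer, and that the number of pruned experts is rounded down. Both are consistent with the 205 figure above but should be confirmed against the base model's config.

```python
def kept_experts(num_experts: int, pruning_ratio: float) -> int:
    """Number of routed experts left per layer after pruning."""
    # Assumption: the pruned count is truncated toward zero.
    pruned = int(num_experts * pruning_ratio)
    return num_experts - pruned


# 256 is an assumed base expert count, not a value stated in this card.
print(kept_experts(256, 0.20))  # 205
```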

REAP integration notes

Qwen3.5 and Qwen3.6 use a packed MoE layout, so the REAP pipeline was extended with architecture-specific adapters. These adapters locate MoE modules, collect packed-expert activation metrics, slice routed expert tensors and router rows, and save reloadable Hugging Face checkpoints while preserving tokenizer and processor files.
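The slicing step can be illustrated with NumPy. This is a sketch only: it assumes that a packed layout stores all experts of a layer in one tensor whose first dimension indexes the expert, and that the router holds one row per expert. The function name, tensor names, and shapes are hypothetical and do not reflect the actual Qwen module attributes.

```python
import numpy as np


def slice_packed_experts(packed, router_weight, keep):
    """Keep only the selected experts in a packed tensor and in the router.

    packed:        array of shape (num_experts, ...), one slab per expert
    router_weight: array of shape (num_experts, hidden_size), one row per expert
    keep:          sorted list of expert indices to retain
    """
    keep = np.asarray(keep, dtype=np.int64)
    # The same index is applied to both tensors so that router row i
    # still scores the expert stored in slab i after pruning.
    return packed[keep], router_weight[keep]


# Toy example: 4 experts, prune expert 1.
packed_gate_up = np.arange(4 * 2 * 3, dtype=np.float32).reshape(4, 2, 3)
router = np.arange(4 * 3, dtype=np.float32).reshape(4, 3)
new_packed, new_router = slice_packed_experts(packed_gate_up, router, [0, 2, 3])
```

After slicing, the expert count recorded in the saved config has to match the new first dimension, otherwise the checkpoint will not reload.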

Details

Base model: Qwen/Qwen3.6-35B-A3B
Pruning method: reap
Pruning ratio: 0.20
Calibration samples: 1024
Calibration sequence length: 2048
Seed: 42
Router renormalization: true
Local checkpoint size before upload: 54G

Pass trust_remote_code=True when loading the model.
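A minimal loading sketch with the transformers library. It assumes that AutoModelForCausalLM is a suitable entry point for this checkpoint and that the accelerate package is installed for device_map="auto"; neither is confirmed by this card. Since the checkpoint also ships processor files, a different Auto class may be needed for non-text inputs.

```python
REPO_ID = "RangerX/Qwen3.6-35B-REAP-Pruned-ratio-0.2"


def build_load_kwargs():
    """Keyword arguments passed to from_pretrained."""
    return {
        "trust_remote_code": True,  # required, as noted above
        "torch_dtype": "auto",      # use the dtype stored in the checkpoint
        "device_map": "auto",       # assumption: accelerate is installed
    }


def load_model(repo_id=REPO_ID):
    """Download and load the tokenizer and model (needs transformers and torch)."""
    # Imported lazily so the helper above can be used without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, **build_load_kwargs())
    return tokenizer, model
```

Calling load_model() downloads the full checkpoint, so make sure enough disk space and memory are available first.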


Safetensors

Model size: 29B params
Tensor type: BF16

Model tree for RangerX/Qwen3.6-35B-REAP-Pruned-ratio-0.2

Base model: Qwen/Qwen3.6-35B-A3B