DeepSeek-V2-Lite-Chat REAP Pruned ratio 0.3

This is a REAP-pruned checkpoint derived from deepseek-ai/DeepSeek-V2-Lite-Chat.

Routed experts were pruned with REAP, using 1024 calibration samples at a sequence length of 2048 tokens. Router weights were renormalized after pruning.
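For readers unfamiliar with REAP, the following is a toy sketch of the kind of saliency criterion it is generally described as using: each routed expert is scored by the average of its gate weight times the norm of its output, taken over the calibration tokens routed to it, and the lowest-scoring experts are removed. This is an illustration under that assumption, not the code used to produce this checkpoint; the function names and the toy data are hypothetical.

```python
def reap_scores(gate_weights, expert_out_norms, num_experts):
    """Average router-weighted output norm per expert (toy illustration).

    gate_weights[t]     maps expert index -> gate weight for token t
                        (only the experts active for that token).
    expert_out_norms[t] maps expert index -> L2 norm of that expert's
                        output for token t.
    """
    totals = [0.0] * num_experts
    counts = [0] * num_experts
    for gates, norms in zip(gate_weights, expert_out_norms):
        for expert, gate in gates.items():
            totals[expert] += gate * norms[expert]
            counts[expert] += 1
    # Experts that never received a token get a score of 0.0 (assumption).
    return [totals[e] / counts[e] if counts[e] else 0.0 for e in range(num_experts)]


def experts_to_prune(scores, n_prune):
    """Return the indices of the n_prune lowest-scoring experts, sorted."""
    order = sorted(range(len(scores)), key=lambda e: scores[e])
    return sorted(order[:n_prune])


# Toy data: 3 tokens, 3 experts, 2 active experts per token.
gate_weights = [{0: 0.6, 1: 0.4}, {0: 0.5, 2: 0.5}, {1: 0.7, 2: 0.3}]
expert_out_norms = [{0: 2.0, 1: 1.0}, {0: 4.0, 2: 1.0}, {1: 1.0, 2: 3.0}]

scores = reap_scores(gate_weights, expert_out_norms, num_experts=3)
pruned = experts_to_prune(scores, n_prune=1)
print(scores, pruned)
```

In this checkpoint the same selection would apply per MoE layer, over 64 routed experts with 19 removed.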

Calibration used the paper-style mixture from:

  • theblackcat102/evol-codealpaca-v1
  • Salesforce/xlam-function-calling-60k
  • open-r1/Mixture-of-Thoughts
  • SWE-bench/SWE-smith-trajectories

Summary

  • Pruning method: reap
  • Requested pruning ratio: 0.30
  • Actual routed-expert pruning ratio: 19 / 64 = 0.296875
  • Routed experts per MoE layer: 64 -> 45
  • Active routed experts per token: 6
  • Shared experts: preserved
  • Seed: 42
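The gap between the requested and actual ratios comes from having to remove a whole number of experts. A minimal sketch of the arithmetic, assuming the expert count is rounded down (the exact rounding rule is not stated here; rounding to nearest gives the same result for these numbers):

```python
def pruned_expert_count(num_experts: int, requested_ratio: float) -> int:
    """Number of routed experts to remove, rounded down (assumed rule)."""
    return int(num_experts * requested_ratio)


num_experts = 64
n_pruned = pruned_expert_count(num_experts, 0.30)  # 64 * 0.30 = 19.2 -> 19
n_remaining = num_experts - n_pruned               # 64 - 19 = 45
actual_ratio = n_pruned / num_experts              # 19 / 64 = 0.296875

print(n_pruned, n_remaining, actual_ratio)
```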

The checkpoint was validated by reloading it as DeepseekV2ForCausalLM with native Transformers DeepSeek-V2 support and running a forward-pass smoke test.
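A comparable smoke test might look like the sketch below. It assumes a Transformers release with native DeepSeek-V2 support, PyTorch, and enough memory for an 11B-parameter BF16 model; the prompt and the function name `smoke_test` are illustrative, and this is not necessarily the script that was used for validation.

```python
MODEL_ID = "RangerX/DeepSeek-V2-Lite-Chat-REAP-Pruned-ratio-0.3"


def smoke_test(model_id: str = MODEL_ID, prompt: str = "Hello") -> tuple:
    """Load the checkpoint, run one forward pass, and return (class name, logits shape)."""
    # Imported lazily so that defining the function does not require the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits

    # A pruned checkpoint that loads correctly should produce finite logits.
    assert torch.isfinite(logits).all(), "non-finite values in logits"
    return type(model).__name__, tuple(logits.shape)
```

Calling `smoke_test()` downloads the weights on first use. With native support, the returned class name is expected to be `DeepseekV2ForCausalLM`.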

Checkpoint format: Safetensors, 11B params, BF16