---
base_model:
- Qwen/Qwen3-Next-80B-A3B-Instruct
tags:
- text-generation-inference
license: apache-2.0
---


![qwen3-next-instruction](https://cdn-uploads.huggingface.co/production/uploads/68121d80da035a609e569a81/Ft9cmZlll_PehtFYkESxH.png)

**Qwen3-Next-REAP-40B-A3B-Instruct** has the following specifications:

- **Type**: Causal Language Models
- **Number of Parameters**: 40B in total and 3B activated
- **Hidden Dimension**: 2048
- **Number of Layers**: 48
- **Hybrid Layout**: 12 * (3 * (Gated DeltaNet -> MoE) -> 1 * (Gated Attention -> MoE))
- **Gated Attention**:
  - **Number of Attention Heads**: 16 for Q and 2 for KV
  - **Head Dimension**: 256
  - **Rotary Position Embedding Dimension**: 64
- **Gated DeltaNet**:
  - **Number of Linear Attention Heads**: 32 for V and 16 for QK
  - **Head Dimension**: 128
- **Mixture of Experts**:
  - **Number of Experts**: 256 (uniformly pruned from 512)
  - **Number of Activated Experts**: 10
  - **Number of Shared Experts**: 1
- **Context Length**: 262,144 natively and extensible up to 1,010,000 tokens
- **Compression Method**: REAP (Router-weighted Expert Activation Pruning)
- **Compression Ratio**: 50% expert pruning
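
As a sanity check on the numbers above, the following minimal Python sketch expands the hybrid layout into a flat list of layers, confirms the layer count and pruning ratio, and gives a rough KV-cache estimate at the native context length. The constant and function names are illustrative and not part of the model's code. The cache estimate assumes 2 bytes per value (bf16/fp16) and counts only the Gated Attention layers, ignoring the fixed-size recurrent state kept by the Gated DeltaNet layers, so treat it as an approximation rather than a measured figure.

```python
# Values copied from the specification list above.
NUM_BLOCKS = 12            # repetitions of the hybrid pattern
DELTANET_PER_BLOCK = 3     # (Gated DeltaNet -> MoE) layers per block
ATTENTION_PER_BLOCK = 1    # (Gated Attention -> MoE) layers per block
NUM_KV_HEADS = 2           # KV heads in Gated Attention
ATTN_HEAD_DIM = 256        # head dimension in Gated Attention
EXPERTS_BEFORE = 512       # experts before pruning
EXPERTS_AFTER = 256        # experts after pruning
NATIVE_CONTEXT = 262_144   # native context length in tokens


def build_layout():
    """Expand 12 * (3 * DeltaNet -> 1 * Attention) into a flat layer list."""
    block = (["gated_deltanet"] * DELTANET_PER_BLOCK
             + ["gated_attention"] * ATTENTION_PER_BLOCK)
    return block * NUM_BLOCKS


def kv_cache_bytes(num_tokens, bytes_per_value=2):
    """Rough per-sequence KV-cache size.

    Assumption: only Gated Attention layers store per-token K and V,
    and each value takes `bytes_per_value` bytes (2 for bf16/fp16).
    """
    attn_layers = build_layout().count("gated_attention")
    per_token = 2 * attn_layers * NUM_KV_HEADS * ATTN_HEAD_DIM * bytes_per_value
    return per_token * num_tokens


layout = build_layout()
pruned_fraction = 1 - EXPERTS_AFTER / EXPERTS_BEFORE

print(len(layout))                              # 48 layers in total
print(layout.count("gated_attention"))          # 12 full-attention layers
print(pruned_fraction)                          # 0.5, i.e. 50% of experts removed
print(kv_cache_bytes(NATIVE_CONTEXT) / 2**30)   # 6.0 GiB under the stated assumptions
```

Because only one layer in four uses full attention and those layers have just 2 KV heads, the per-token cache under these assumptions is 24 KiB, which is what keeps the estimate small even at 262,144 tokens.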