Commit: de7cc52
Parent(s): fa66c52
Add pipeline tag and project links (#1)
- Add pipeline tag and project links (26276c9ac00dfd238408f2492e4b449cb2240307)
Co-authored-by: Niels Rogge <nielsr@users.noreply.huggingface.co>
README.md
CHANGED
@@ -1,26 +1,32 @@
 ---
-license: mit
 base_model:
 - z-lab/Qwen3-4B-DFlash-b16
+license: mit
+pipeline_tag: text-generation
 ---
+
 # Qwen3-4B-Ins-Draft-OPD
 
 This repository contains **Qwen3-4B-Ins-Draft-OPD**, a draft model for speculative decoding.
 
-The model is post-trained from [`z-lab/Qwen3-4B-DFlash-b16`](https://huggingface.co/z-lab/Qwen3-4B-DFlash-b16). It keeps the overall architecture and inference interface consistent with the original DFlash draft model, while further adapting the draft model through
+The model is post-trained from [`z-lab/Qwen3-4B-DFlash-b16`](https://huggingface.co/z-lab/Qwen3-4B-DFlash-b16). It keeps the overall architecture and inference interface consistent with the original DFlash draft model, while further adapting the draft model through the Draft-OPD post-training method.
+
+- **Paper:** [Draft-OPD: On-Policy Distillation for Speculative Draft Models](https://huggingface.co/papers/2605.29343)
+- **Project Page:** [https://www.haodilei.top/draft-opd/](https://www.haodilei.top/draft-opd/)
+- **Code:** [https://github.com/bingyang-lei/Draft-OPD](https://github.com/bingyang-lei/Draft-OPD)
 
 ## Model Details
 
 - **Base draft model:** [`Qwen3-4B(enable_thinking=true)`](https://huggingface.co/Qwen/Qwen3-4B)
 - **Model type:** Draft model for speculative decoding
 - **Architecture:** Same as the original DFlash draft model
-- **Post-training method:** Draft-OPD
+- **Post-training method:** Draft-OPD (On-Policy Distillation)
 
-##
+## Method Summary
 
-
+Draft-OPD trains speculative draft models with on-policy target feedback. Instead of only learning from fixed target-generated trajectories (SFT), the drafter is supervised on draft-induced states exposed during speculative verification. This allows the drafter to learn from target feedback on both accepted and rejected proposals, focusing training on the draft-induced errors that limit speculative acceptance.
 
-
+For detailed training procedures, evaluation settings, and performance results, please refer to the paper.
 
 ## Citation
 
@@ -35,4 +41,5 @@ If you find our work useful, please consider citing our paper:
 archivePrefix={arXiv},
 primaryClass={cs.CL},
 url={https://arxiv.org/abs/2605.29343},
-}
+}
+```
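The Method Summary added in this commit says the drafter is supervised on draft-induced states exposed during speculative verification, with target feedback on both accepted and rejected proposals. The toy Python sketch below is only an illustration of that idea, not the paper's training procedure. It assumes greedy, exact-match verification, and the names `toy_target`, `toy_draft`, and `verify_block` are hypothetical stand-ins rather than anything from the Draft-OPD code or the DFlash interface.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]
Feedback = Tuple[Tuple[Token, ...], Token, Token]  # (state, drafted token, target token)


def toy_target(prefix: List[Token]) -> Token:
    # Hypothetical target model: next token is the prefix sum modulo 5.
    return sum(prefix) % 5


def toy_draft(prefix: List[Token]) -> Token:
    # Hypothetical drafter: matches the target except when the prefix
    # length is a multiple of 3, where it is off by one.
    guess = sum(prefix) % 5
    return (guess + 1) % 5 if len(prefix) % 3 == 0 else guess


def verify_block(
    prefix: List[Token], draft: NextToken, target: NextToken, k: int
) -> Tuple[List[Token], List[Feedback]]:
    """Draft k tokens, then verify them against the target's greedy choice.

    Returns the tokens emitted this step and one feedback record for every
    position the target checked, including the rejected one.
    """
    # 1) The drafter rolls out k tokens on its own (on-policy) states.
    proposals: List[Token] = []
    state = list(prefix)
    for _ in range(k):
        tok = draft(state)
        proposals.append(tok)
        state.append(tok)

    # 2) The target checks each proposal in order.
    emitted: List[Token] = []
    feedback: List[Feedback] = []
    state = list(prefix)
    for tok in proposals:
        target_tok = target(state)
        feedback.append((tuple(state), tok, target_tok))
        if tok != target_tok:
            emitted.append(target_tok)  # target's correction replaces the rejected token
            return emitted, feedback
        emitted.append(tok)
        state.append(tok)

    emitted.append(target(state))  # bonus token when every proposal is accepted
    return emitted, feedback


if __name__ == "__main__":
    emitted, feedback = verify_block([1, 2], toy_draft, toy_target, k=3)
    print("emitted:", emitted)
    for state, drafted, wanted in feedback:
        status = "accepted" if drafted == wanted else "rejected"
        print(state, "drafted", drafted, "target", wanted, status)
```

Each feedback record pairs a state the drafter itself reached with the target's preferred token there, so in this sketch a training signal exists at the rejected position as well as at the accepted ones. How Draft-OPD turns such feedback into a loss is described in the paper and is not modeled here.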