BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation (arXiv:2402.10631)
| PPL | arc_easy | arc_challenge | piqa | winogrande | hellaswag | mmlu | QA Avg |
|---|---|---|---|---|---|---|---|
| 17.11 | 37.84 ± 1.00 | 21.59 ± 1.20 | 62.46 ± 1.13 | 52.41 ± 1.40 | 33.39 ± 0.47 | - | 41.54 |
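The QA Avg column appears to be the unweighted mean of the five reported task scores, with mmlu left out because no value is given for it. This reading is an assumption that the table's numbers happen to support, and the short sketch below checks the arithmetic; the variable names are illustrative.

```python
# Task scores copied from the table above (standard errors omitted).
# mmlu is excluded because the table reports no value for it.
scores = {
    "arc_easy": 37.84,
    "arc_challenge": 21.59,
    "piqa": 62.46,
    "winogrande": 52.41,
    "hellaswag": 33.39,
}

# Assumption: QA Avg is a plain unweighted mean over the reported tasks.
qa_avg = sum(scores.values()) / len(scores)
print(f"QA Avg: {qa_avg:.2f}")
```

The result agrees with the 41.54 reported in the table.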
The training method is based on the BitDistiller paper.
Base model: TinyLlama/TinyLlama_v1.1