---
license: mit
pipeline_tag: text-generation
base_model_relation: quantized
library_name: gguf
tags:
- gguf
- qwen3.6
- reap
---

> [!TIP]
> **[Support this work →](https://donate.sybilsolutions.ai)** · [X](https://x.com/0xsero) · [GitHub](https://github.com/0xsero) · [REAP paper](https://arxiv.org/abs/2510.13999) · [Cerebras REAP](https://huggingface.co/collections/cerebras/cerebras-reap)

# Qwen3.6-28B-GGUF

GGUF quantization of the base model.

## At a glance

| | |
|---|---|
| Base model | — |
| Format | GGUF |
| Total params | **28B** |
| Active / token | 3B |
| Experts / layer | — |
| Layers | — |
| Hidden size | — |
| Context | — |
| On-disk size | 147 GB |

## Which variant should I pick?

| Variant | Format | Link |
|---|---|---|
| `Qwen3.6-28B` | BF16 | [link](https://huggingface.co/0xSero/Qwen3.6-28B) |
| `Qwen3.6-28B-GGUF` **(this)** | GGUF | [link](https://huggingface.co/0xSero/Qwen3.6-28B-GGUF) |
| `Qwen3.6-35B-GGUF` | GGUF | [link](https://huggingface.co/0xSero/Qwen3.6-35B-GGUF) |

## License & citation

License inherited from the base model.

```bibtex
@misc{lasby2025reap,
  title         = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author        = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year          = {2025},
  eprint        = {2510.13999},
  archivePrefix = {arXiv}
}
```

## Sponsors

Made possible by **NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle**.