metadata
license: mit
pipeline_tag: text-generation
base_model_relation: quantized
library_name: gguf
tags:
  - gguf
  - qwen3.6
  - reap

Support this work → · X · GitHub · REAP paper · Cerebras REAP

Qwen3.6-28B-GGUF

GGUF quantization of the base model.
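As a quick sanity check after downloading, the fixed-size GGUF header can be read with the Python standard library. This is a minimal sketch, not part of this repository's tooling: it assumes a little-endian file using GGUF version 2 or later (where both counts are 64-bit), and `read_gguf_header` is a hypothetical helper name introduced here for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes little-endian GGUF v2+ layout:
    4-byte magic, uint32 version, uint64 tensor count, uint64 metadata KV count.
    """
    with open(path, "rb") as f:
        raw = f.read(24)  # 4 + 4 + 8 + 8 bytes
    if len(raw) < 24 or raw[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    version, tensor_count, kv_count = struct.unpack("<IQQ", raw[4:24])
    return version, tensor_count, kv_count
```

Calling `read_gguf_header("model.gguf")` (the path is a placeholder, not an actual file name from this repository) raises `ValueError` for a truncated or non-GGUF download and otherwise returns the three header fields.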

At a glance

| Field | Value |
| --- | --- |
| Base model | |
| Format | GGUF |
| Total params | 28B |
| Active / token | 3B |
| Experts / layer | |
| Layers | |
| Hidden size | |
| Context | |
| On-disk size | 147 GB |

Which variant should I pick?

| Variant | Format | Link |
| --- | --- | --- |
| Qwen3.6-28B | BF16 | link |
| Qwen3.6-28B-GGUF (this) | GGUF | link |
| Qwen3.6-35B-GGUF | GGUF | link |

License & citation

License inherited from the base model.

@misc{lasby2025reap,
  title  = {REAP the Experts: Why Pruning Prevails for One-Shot MoE Compression},
  author = {Mike Lasby and Ivan Lazarevich and Nish Sinnadurai and Sean Lie and Yani Ioannou and Vithursan Thangarasa},
  year   = {2025}, eprint = {2510.13999}, archivePrefix = {arXiv}
}

Sponsors

Made possible by NVIDIA · TNG Technology · Lambda · Prime Intellect · Hot Aisle.