--- license: apache-2.0 base_model: Qwen/Qwen3.5-35B-A3B base_model_relation: quantized library_name: transformers tags: - dashq - quantized - post-training-quantization - int3 --- ![DASH-Q](https://raw.githubusercontent.com/JaeminK/dashq/main/assets/dashq_banner.png) # Qwen3.5-35B-A3B-DASHQ-INT3-g128 > **DASH-Q** — Diagonal-Aware Shrinkage for Robust PTQ. > `INT3` · group size 128 · **17.4800 GB** (from 71.9039 GB — **4.1x smaller**) DASH-Q checkpoints load with the lightweight DASH-Q runtime — linear layers are packed `PackedQuantizedLinear` modules, not plain Transformers weights. ## Install ```bash pip install git+https://github.com/JaeminK/dashq.git ``` ## Load ```python from dashq import load_quantized model, tokenizer = load_quantized("jkim96/Qwen3.5-35B-A3B-DASHQ-INT3-g128", device_map="auto") ``` ## Quantization | Field | Value | | --- | --- | | Base model | `Qwen/Qwen3.5-35B-A3B` | | Precision | INT3, group size 128 | | Scale / zero dtype | float16 | | Calibration | wikitext2, 128 samples x 2048 | | Size | 17.4800 GB · original 71.9039 GB · 4.1x compression | ## Benchmarks Full zero-shot / few-shot results for every DASH-Q checkpoint: **[github.com/JaeminK/dashq#benchmarks](https://github.com/JaeminK/dashq#benchmarks)**