---
license: apache-2.0
base_model: Qwen/Qwen3.5-35B-A3B
base_model_relation: quantized
library_name: transformers
tags:
- dashq
- quantized
- post-training-quantization
- int3
---

![DASH-Q](https://raw.githubusercontent.com/JaeminK/dashq/main/assets/dashq_banner.png)

# Qwen3.5-35B-A3B-DASHQ-INT3-g128

> **DASH-Q** — Diagonal-Aware Shrinkage for Robust PTQ.
> `INT3` · group size 128 · **17.4800 GB** (from 71.9039 GB — **4.1x smaller**)

DASH-Q checkpoints load with the lightweight DASH-Q runtime — linear layers are packed `PackedQuantizedLinear` modules, not plain Transformers weights.

## Install

```bash
pip install git+https://github.com/JaeminK/dashq.git
```

## Load

```python
from dashq import load_quantized

model, tokenizer = load_quantized("jkim96/Qwen3.5-35B-A3B-DASHQ-INT3-g128", device_map="auto")
```

## Quantization

| Field | Value |
| --- | --- |
| Base model | `Qwen/Qwen3.5-35B-A3B` |
| Precision | INT3, group size 128 |
| Scale / zero dtype | float16 |
| Calibration | wikitext2, 128 samples x 2048 |
| Size | 17.4800 GB  ·  original 71.9039 GB  ·  4.1x compression |

## Benchmarks

Full zero-shot / few-shot results for every DASH-Q checkpoint:
**[github.com/JaeminK/dashq#benchmarks](https://github.com/JaeminK/dashq#benchmarks)**