Qwen3.5-9B HLWQ - a caiovicentino1 Collection

Qwen3.5-9B HLWQ

updated Apr 13

Qwen3.5-9B · HLWQ Q5 · lower perplexity than torchao INT4 (6.56 vs 6.68) · CUDA + MLX

caiovicentino1/Qwen3.5-9B-HLWQ-Q5
Text Generation • 9B • Updated Apr 13 • 231 downloads • 3 likes

caiovicentino1/Qwen3.5-9B-HLWQ-MLX-4bit
Text Generation • 1B • Updated Apr 13 • 167 downloads • 3 likes

caiovicentino1/Qwen3.5-9B-HLWQ-Engine-v4
Text Generation • 7B • Updated Apr 13 • 9 downloads

caiovicentino1/Qwen3.5-9B-EOQ-v3
Text Generation • 5B • Updated Apr 6 • 72 downloads • 1 like