Qwable-3.6-27b — iMatrix GGUF

The first iMatrix GGUF for Qwable-3.6-27b — Q2_K through Q8_0 with importance matrix calibration.

Qwable-3.6-27b is a fine-tune of Qwen 27B trained on Fable 5-style reasoning traces. It thinks before answering and produces structured, deliberate responses optimized for code and technical tasks.

These GGUFs are produced from the F16 source using importance matrix (iMatrix) calibration on 2M tokens of wikitext-103. iMatrix identifies which weights matter most during inference and protects them during quantization — the result is noticeably better coherence at Q2/Q3/Q4.


Quick Start

llama.cpp

llama-cli -hf liodon-ai/Qwable-3.6-27b-imatrix-GGUF:Q4_K_M

LM Studio / Jan

Search for liodon-ai/Qwable-3.6-27b-imatrix-GGUF and pick your quant.


Available Quants

Quant     Size      VRAM    Notes
Q2_K      10.9 GB    9 GB   smallest; runs almost anywhere, iMatrix-improved
Q3_K_M    13.5 GB   11 GB   good fit for 12 GB VRAM, iMatrix-improved
Q4_K_M    16.8 GB   14 GB   sweet spot (recommended), iMatrix-improved
Q5_K_M    19.5 GB   18 GB   high quality, iMatrix-improved
Q6_K      22.4 GB   20 GB   near-lossless, iMatrix-improved
Q8_0      29.0 GB   28 GB   basically full quality
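
As a rough illustration of how the table might be used, the Python sketch below picks the largest quant whose VRAM figure fits a given budget. The figures are copied from the table above. The helper name pick_quant and the idea of choosing purely by the VRAM column are assumptions for illustration; real memory use also depends on context length and KV cache settings.

```python
# VRAM figures (GB) copied from the table above, ordered smallest to largest.
QUANTS = [
    ("Q2_K", 9),
    ("Q3_K_M", 11),
    ("Q4_K_M", 14),
    ("Q5_K_M", 18),
    ("Q6_K", 20),
    ("Q8_0", 28),
]

def pick_quant(vram_gb):
    """Return the largest listed quant whose VRAM figure fits, or None if none fit."""
    fitting = [name for name, need in QUANTS if need <= vram_gb]
    return fitting[-1] if fitting else None

print(pick_quant(16))  # Q4_K_M
```

The returned name can be used as the tag after the colon in the llama-cli command from Quick Start.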

Why iMatrix for Qwable?

Qwable uses chain-of-thought reasoning — it emits long <think> traces before answering. At low-bit quantization, coherence over long sequences matters more than for simple Q&A models. iMatrix protects the weights that sustain long reasoning chains, giving noticeably better output at Q2_K and Q3_K_M compared to standard quantization.
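
Because the model emits its reasoning before the answer, client code often wants to separate the two. The Python sketch below is one way to do that. It assumes the trace is closed with a matching </think> tag, which this card does not state explicitly, and the helper name split_reasoning is hypothetical.

```python
import re

# Assumption: the reasoning trace is wrapped as <think> ... </think>.
_THINK = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    """Return (reasoning, answer); reasoning is empty if no trace is found."""
    match = _THINK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer

reasoning, answer = split_reasoning("<think>Check the loop bounds.</think>Use range(n).")
print(answer)  # Use range(n).
```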


What is iMatrix?

Standard quantization treats every weight as equally important when rounding. iMatrix:

  1. Runs calibration text through the full-precision model
  2. Records how strongly the inputs to each weight activate on that text (the "importance matrix")
  3. Weights the rounding error by that importance, so important weights are reproduced more accurately and unimportant ones less so

Same file size. Better output. Most noticeable at Q2/Q3/Q4.
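
The idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not llama.cpp's actual quantizer: it uses a single block of eight made-up weights, made-up importance values, a 3-bit symmetric grid, and a simple search over candidate scales. Both runs store the same number of bits per weight; only the choice of scale differs.

```python
BITS = 3
QMAX = 2 ** (BITS - 1) - 1  # symmetric integer levels -3..3

def dequantize(weights, scale):
    """Round each weight to the nearest level and map it back to a float."""
    return [max(-QMAX, min(QMAX, round(w / scale))) * scale for w in weights]

def weighted_error(weights, scale, importance):
    """Importance-weighted squared rounding error for one candidate scale."""
    approx = dequantize(weights, scale)
    return sum(i * (w - a) ** 2 for w, a, i in zip(weights, approx, importance))

def best_scale(weights, importance, steps=50):
    """Search a small grid of scales and keep the one with the lowest weighted error."""
    base = max(abs(w) for w in weights) / QMAX  # scale that just covers the largest weight
    grid = [base * (0.5 + 0.5 * k / steps) for k in range(steps + 1)]
    return min(grid, key=lambda s: weighted_error(weights, s, importance))

# Made-up example data: a few weights are marked as far more important than the rest.
weights = [0.90, -0.31, 0.12, 0.47, -0.05, 0.26, -0.74, 0.08]
importance = [0.1, 5.0, 0.2, 4.0, 0.1, 3.0, 0.1, 0.2]
uniform = [1.0] * len(weights)

plain_scale = best_scale(weights, uniform)     # every weight counts equally
tuned_scale = best_scale(weights, importance)  # important weights count more

# Compare both choices on the error that matters: the importance-weighted one.
print(weighted_error(weights, plain_scale, importance))
print(weighted_error(weights, tuned_scale, importance))
```

Since both scales come from the same candidate grid, the second printed error can never exceed the first.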


Calibration

The importance matrix was computed from 2M tokens of wikitext-103, using 128 calibration chunks.
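
For intuition, the Python sketch below shows one common way an importance vector can be accumulated over calibration chunks: the mean squared activation per input column. It is a simplified assumption about the statistic, not the exact procedure used for these files, and the function name and the tiny activation values are made up for illustration.

```python
def accumulate_importance(chunks):
    """Mean squared activation per column over all tokens in all chunks.

    chunks: list of chunks, each a list of per-token activation vectors.
    """
    n_cols = len(chunks[0][0])
    sums = [0.0] * n_cols
    tokens = 0
    for chunk in chunks:
        for activation in chunk:
            for j, x in enumerate(activation):
                sums[j] += x * x
            tokens += 1
    return [s / tokens for s in sums]

# Two tiny made-up chunks of two tokens each, three input columns.
chunks = [
    [[1.0, 0.0, 2.0], [1.0, 0.0, 0.0]],
    [[3.0, 0.0, 0.0], [1.0, 0.0, 2.0]],
]
print(accumulate_importance(chunks))  # [3.0, 0.0, 2.0]
```

Columns that never activate on the calibration text (the middle one here) end up with zero importance, which is why the choice of calibration data matters.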


Source Model

  • Original: Mia-AiLab/Qwable-3.6-27b — 22.9K downloads
  • Architecture: Qwen3.5 27B (GGUF architecture: qwen35), fine-tuned on Fable 5 reasoning traces
  • Strengths: Code, debugging, technical reasoning, structured tasks
  • License: MIT
  • Base model (per the model tree): Qwen/Qwen3.6-27B