---
license: apache-2.0
library_name: gguf
pipeline_tag: text-generation
base_model:
- DJLougen/Qwable-5-27B-Coder
base_model_relation: quantized
language:
- en
tags:
- gguf
- llama.cpp
- qwen
- qwen3_6
- qwen3_5
- coder
- coding-agent
- agentic-coding
- tool-use
- repository-work
- terminal-workflows
- long-context
- imatrix
- qlora
---
# Qwable-5-27B-Coder-GGUF
**Qwable-5-27B-Coder-GGUF** packages the Qwable coder-agent tune for llama.cpp, Ollama, and local workstation inference. It comes from a Qwen3.6-based model trained first on **Claude Fable 5 traces**, then continued on **Kimi 2.7 Coder traces**.
Use this repo when you want Qwable's coding-agent behavior in GGUF form: repository inspection, patch planning, terminal feedback, verifier recovery, and long-context coding prompts.
## Quant menu
| File | Quant | Approx. size | Best for |
| --- | ---: | ---: | --- |
| `Qwable-5-27B-Coder-Q8_0.gguf` | Q8_0 | 28.6 GB | quality checks, quant comparisons, high-memory local serving |
| `Qwable-5-27B-Coder-Q4_K_M.gguf` | Q4_K_M | 16.5 GB | default local starting point |
| `Qwable-5-27B-Coder-IQ1_S.gguf` | IQ1_S | 7.1 GB | tight memory budgets; expect quality tradeoffs |
`IQ1_S` uses an importance matrix computed on the training traces.
## Model facts
| Attribute | Details |
| --- | --- |
| GGUF repo | `DJLougen/Qwable-5-27B-Coder-GGUF` |
| Source checkpoint | [`DJLougen/Qwable-5-27B-Coder`](https://huggingface.co/DJLougen/Qwable-5-27B-Coder) |
| Upstream base | [`unsloth/Qwen3.6-27B`](https://huggingface.co/unsloth/Qwen3.6-27B) |
| Runtime target | llama.cpp-compatible local inference |
| Architecture tag | `qwen3_5` |
| Scope | Text tower only; no vision sidecar in this repo |
| Training signal | Claude Fable 5 traces, then Kimi 2.7 Coder traces |
| License | Apache-2.0 |
```text
BF16 source checkpoint
-> GGUF conversion
-> Q8_0: quality reference
-> Q4_K_M: normal local use
-> IQ1_S: smallest imatrix build
```
Early maintainer runs show the source checkpoint outperforming the base model on a private coder benchmark. Public benchmark details are not posted yet, so treat that as early maintainer signal rather than a reproducible leaderboard claim.
## Quickstart
Requires a llama.cpp build with `qwen3_5` support.
Run a local OpenAI-compatible server:
```bash
llama-server -hf DJLougen/Qwable-5-27B-Coder-GGUF:Q4_K_M \
--jinja -ngl 99 -fa -c 32768 \
--temp 1.0 --top-p 0.95 --top-k 20
```
Run with Ollama:
```bash
ollama run hf.co/DJLougen/Qwable-5-27B-Coder-GGUF:Q4_K_M
```
Download one file:
```bash
hf download DJLougen/Qwable-5-27B-Coder-GGUF \
Qwable-5-27B-Coder-Q4_K_M.gguf \
--local-dir .
```
## Choosing a file
- Start with `Q4_K_M` unless you are explicitly testing quality ceilings or memory floors.
- Use `Q8_0` for comparisons against the source checkpoint or high-memory local serving.
- Use `IQ1_S` only when the model otherwise will not fit; verify quality on your own tasks.
- Keep prompts concrete: include repository context, exact errors, constraints, and verifier commands.
## Related releases
- Source BF16 Transformers checkpoint: [`DJLougen/Qwable-5-27B-Coder`](https://huggingface.co/DJLougen/Qwable-5-27B-Coder)
- NVFP4 ModelOpt checkpoint: [`DJLougen/Qwable-5-27B-Coder-NVFP4`](https://huggingface.co/DJLougen/Qwable-5-27B-Coder-NVFP4)
## Limitations
- Public benchmark tables are pending.
- Low-bit GGUF quantization can reduce instruction following, code precision, and tool-call reliability.
- This repo contains text GGUF files only; it is not the full multimodal Transformers checkpoint.
- Long-context behavior depends on llama.cpp build, hardware, KV cache settings, and prompt layout.
- Safety behavior is inherited from the base model and fine-tuning data; no separate safety alignment claim is made here.
## License
Released under Apache-2.0, following the upstream base model license metadata.