DJLougen's picture
Redesign GGUF model card
763ee7d verified
|
Raw
History Blame
4.57 kB
metadata
license: apache-2.0
library_name: gguf
pipeline_tag: text-generation
base_model:
  - DJLougen/Qwable-5-27B-Coder
base_model_relation: quantized
language:
  - en
tags:
  - gguf
  - llama.cpp
  - qwen
  - qwen3_6
  - qwen3_5
  - coder
  - coding-agent
  - agentic-coding
  - tool-use
  - repository-work
  - terminal-workflows
  - long-context
  - imatrix
  - qlora

Qwable-5-27B-Coder banner

Qwable GGUF quant runway

Qwable-5-27B-Coder-GGUF

Qwable-5-27B-Coder-GGUF packages the Qwable coder-agent tune for llama.cpp, Ollama, and local workstation inference. It comes from a Qwen3.6-based model trained first on Claude Fable 5 traces, then continued on Kimi 2.7 Coder traces.

Use this repo when you want Qwable's coding-agent behavior in GGUF form: repository inspection, patch planning, terminal feedback, verifier recovery, and long-context coding prompts.

Support on Ko-fi

Quant menu

File Quant Approx. size Best for
Qwable-5-27B-Coder-Q8_0.gguf Q8_0 28.6 GB quality checks, quant comparisons, high-memory local serving
Qwable-5-27B-Coder-Q4_K_M.gguf Q4_K_M 16.5 GB default local starting point
Qwable-5-27B-Coder-IQ1_S.gguf IQ1_S 7.1 GB tight memory budgets; expect quality tradeoffs

IQ1_S uses an importance matrix computed on the training traces.

Model facts

Attribute Details
GGUF repo DJLougen/Qwable-5-27B-Coder-GGUF
Source checkpoint DJLougen/Qwable-5-27B-Coder
Upstream base unsloth/Qwen3.6-27B
Runtime target llama.cpp-compatible local inference
Architecture tag qwen3_5
Scope Text tower only; no vision sidecar in this repo
Training signal Claude Fable 5 traces, then Kimi 2.7 Coder traces
License Apache-2.0
BF16 source checkpoint
  -> GGUF conversion
      -> Q8_0: quality reference
      -> Q4_K_M: normal local use
      -> IQ1_S: smallest imatrix build

Early maintainer runs show the source checkpoint outperforming the base model on a private coder benchmark. Public benchmark details are not posted yet, so treat that as early maintainer signal rather than a reproducible leaderboard claim.

Quickstart

Requires a llama.cpp build with qwen3_5 support.

Run a local OpenAI-compatible server:

llama-server -hf DJLougen/Qwable-5-27B-Coder-GGUF:Q4_K_M \
  --jinja -ngl 99 -fa -c 32768 \
  --temp 1.0 --top-p 0.95 --top-k 20

Run with Ollama:

ollama run hf.co/DJLougen/Qwable-5-27B-Coder-GGUF:Q4_K_M

Download one file:

hf download DJLougen/Qwable-5-27B-Coder-GGUF \
  Qwable-5-27B-Coder-Q4_K_M.gguf \
  --local-dir .

Choosing a file

  • Start with Q4_K_M unless you are explicitly testing quality ceilings or memory floors.
  • Use Q8_0 for comparisons against the source checkpoint or high-memory local serving.
  • Use IQ1_S only when the model otherwise will not fit; verify quality on your own tasks.
  • Keep prompts concrete: include repository context, exact errors, constraints, and verifier commands.

Related releases

Limitations

  • Public benchmark tables are pending.
  • Low-bit GGUF quantization can reduce instruction following, code precision, and tool-call reliability.
  • This repo contains text GGUF files only; it is not the full multimodal Transformers checkpoint.
  • Long-context behavior depends on llama.cpp build, hardware, KV cache settings, and prompt layout.
  • Safety behavior is inherited from the base model and fine-tuning data; no separate safety alignment claim is made here.

License

Released under Apache-2.0, following the upstream base model license metadata.