How to use from
Lemonade
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull FreedomAISVR/Qwable-v1-MXFP4-MOE-GGUF:F16
Run and chat with the model
lemonade run user.Qwable-v1-MXFP4-MOE-GGUF-F16
List all available models
lemonade list
Quick Links

Qwable-v1 MXFP4 MoE GGUF

GGUF quantization of lordx64/Qwable-v1 โ€” an agentic coding model built by layering Claude Fable-5 tool-use behavior on top of a Claude Opus 4.7 reasoning distill of Qwen3.6-35B-A3B.

Model Details

  • Architecture: Qwen3.5 MoE, 41 blocks (40 layers + 1 MTP head), 256 experts (8 active/token)
  • Active Parameters: ~3B
  • Context: 262,144 tokens
  • Vision: Yes (27-layer SigLIP ViT)
  • License: AGPL-3.0
  • Base: Qwen3.6-35B-A3B โ†’ Opus 4.7 reasoning distill โ†’ Fable-5 agentic SFT

What's Included

File Type Size BPW
qwable-v1-mxfp4_moe.gguf MXFP4 (experts) + Q8_0 (non-experts) ~18.87 GB 4.56
mmproj-qwable-v1-f16.gguf Vision projector (F16) ~0.88 GB F16

Quantization Details

MXFP4_MOE

  • MoE expert weights (ffn_down_exps, ffn_gate_exps, ffn_up_exps) quantized to MXFP4
  • Non-expert weights (attention, shared experts, norms) quantized to Q8_0
  • Router weights kept at F32
  • Vision encoder and projector kept at F16 (not quantized)

Usage

# With llama.cpp (vision + text)
./llama-server -m qwable-v1-mxfp4_moe.gguf --mmproj mmproj-qwable-v1-f16.gguf --host 0.0.0.0 --port 8080

# Text-only (no vision)
./llama-cli -m qwable-v1-mxfp4_moe.gguf -p "Hello, how are you?"

Agentic Tool-Use

Qwable-v1 emits <tool_use> XML when prompted with an agent-style system prompt:

system: You are a coding agent. When you need to read, write, edit, or run code,
emit XML tool calls in this exact format:
<tool_use name="X" id="toolu_01abc">
{"...": "..."}
</tool_use>

Without the agent prompt, the model falls back to the Opus 4.7 reasoning prior (markdown code blocks).

Credits

Downloads last month
602
GGUF
Model size
36B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support