How to use from
Unsloth Studio
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FreedomAISVR/Qwable-v1-MXFP4-MOE-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex
# Run unsloth studio
unsloth studio -H 0.0.0.0 -p 8888
# Then open http://localhost:8888 in your browser
# Search for FreedomAISVR/Qwable-v1-MXFP4-MOE-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for FreedomAISVR/Qwable-v1-MXFP4-MOE-GGUF to start chatting
Quick Links

Qwable-v1 MXFP4 MoE GGUF

GGUF quantization of lordx64/Qwable-v1 โ€” an agentic coding model built by layering Claude Fable-5 tool-use behavior on top of a Claude Opus 4.7 reasoning distill of Qwen3.6-35B-A3B.

Model Details

  • Architecture: Qwen3.5 MoE, 41 blocks (40 layers + 1 MTP head), 256 experts (8 active/token)
  • Active Parameters: ~3B
  • Context: 262,144 tokens
  • Vision: Yes (27-layer SigLIP ViT)
  • License: AGPL-3.0
  • Base: Qwen3.6-35B-A3B โ†’ Opus 4.7 reasoning distill โ†’ Fable-5 agentic SFT

What's Included

File Type Size BPW
qwable-v1-mxfp4_moe.gguf MXFP4 (experts) + Q8_0 (non-experts) ~18.87 GB 4.56
mmproj-qwable-v1-f16.gguf Vision projector (F16) ~0.88 GB F16

Quantization Details

MXFP4_MOE

  • MoE expert weights (ffn_down_exps, ffn_gate_exps, ffn_up_exps) quantized to MXFP4
  • Non-expert weights (attention, shared experts, norms) quantized to Q8_0
  • Router weights kept at F32
  • Vision encoder and projector kept at F16 (not quantized)

Usage

# With llama.cpp (vision + text)
./llama-server -m qwable-v1-mxfp4_moe.gguf --mmproj mmproj-qwable-v1-f16.gguf --host 0.0.0.0 --port 8080

# Text-only (no vision)
./llama-cli -m qwable-v1-mxfp4_moe.gguf -p "Hello, how are you?"

Agentic Tool-Use

Qwable-v1 emits <tool_use> XML when prompted with an agent-style system prompt:

system: You are a coding agent. When you need to read, write, edit, or run code,
emit XML tool calls in this exact format:
<tool_use name="X" id="toolu_01abc">
{"...": "..."}
</tool_use>

Without the agent prompt, the model falls back to the Opus 4.7 reasoning prior (markdown code blocks).

Credits

Downloads last month
602
GGUF
Model size
36B params
Architecture
qwen35moe
Hardware compatibility
Log In to add your hardware

4-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support