Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx

MLX quantization of empero-ai/Qwythos-9B-Claude-Mythos-5-1M for Apple Silicon.

Note — text tower only. The source model is a Qwen3.5-VL multimodal model (Qwen3_5ForConditionalGeneration, with a vision encoder). This MLX conversion contains only the text/language tower — the vision encoder weights are not included, so this is a text-only model and does not accept image or video input. The text reasoning the original is benchmarked for (GSM8K, MMLU) is unaffected.

Variant: Block float MX FP8
Disk size: 8826 MB
Quantized by: sahilchachra

Benchmark results

Evaluated on Apple M5 Pro with MLX. Model loaded once; performance and quality measured in a single pass.

Performance

This model FP16 baseline
Decode tok/s (avg, long traces) 30.67 N/A
Peak memory (GB) 9.599 N/A
Disk size (MB) 8826 17969

Quality

Benchmark This model FP16 baseline n
GSM8K (math, accuracy) 100.0% N/A 50
MMLU (knowledge, accuracy) 80.0% N/A 50

Context scaling (decode tok/s)

Context length Decode tok/s
~128 tokens 33.7
~256 tokens 33.6
~512 tokens 33.6
~1024 tokens 33.5

Usage

pip install mlx-lm
from mlx_lm import load, generate

model, tokenizer = load("sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx")
response = generate(model, tokenizer, prompt="Your prompt here", max_tokens=256, verbose=True)

All variants in this collection

Model Variant
sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp4-mlx Block float MX FP4
sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx Block float MX FP8 ← this model
sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-optiq-5bpw-mlx OptiQ mixed-precision (target 5.0 bpw)

Notes

Original model

See empero-ai/Qwythos-9B-Claude-Mythos-5-1M for full model details and intended use.

Downloads last month
14
Safetensors
Model size
9B params
Tensor type
U8
·
U32
·
BF16
·
MLX
Hardware compatibility
Log In to add your hardware

8-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx

Finetuned
Qwen/Qwen3.5-9B
Quantized
(12)
this model

Collection including sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx