Instructions to use sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MLX
How to use sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx with MLX:
# Download the model from the Hub pip install huggingface_hub[hf_xet] huggingface-cli download --local-dir Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- LM Studio
Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx
MLX quantization of empero-ai/Qwythos-9B-Claude-Mythos-5-1M for Apple Silicon.
Note — text tower only. The source model is a Qwen3.5-VL multimodal model (
Qwen3_5ForConditionalGeneration, with a vision encoder). This MLX conversion contains only the text/language tower — the vision encoder weights are not included, so this is a text-only model and does not accept image or video input. The text reasoning the original is benchmarked for (GSM8K, MMLU) is unaffected.
Variant: Block float MX FP8
Disk size: 8826 MB
Quantized by: sahilchachra
Benchmark results
Evaluated on Apple M5 Pro with MLX. Model loaded once; performance and quality measured in a single pass.
Performance
| This model | FP16 baseline | |
|---|---|---|
| Decode tok/s (avg, long traces) | 30.67 | N/A |
| Peak memory (GB) | 9.599 | N/A |
| Disk size (MB) | 8826 | 17969 |
Quality
| Benchmark | This model | FP16 baseline | n |
|---|---|---|---|
| GSM8K (math, accuracy) | 100.0% | N/A | 50 |
| MMLU (knowledge, accuracy) | 80.0% | N/A | 50 |
Context scaling (decode tok/s)
| Context length | Decode tok/s |
|---|---|
| ~128 tokens | 33.7 |
| ~256 tokens | 33.6 |
| ~512 tokens | 33.6 |
| ~1024 tokens | 33.5 |
Usage
pip install mlx-lm
from mlx_lm import load, generate
model, tokenizer = load("sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx")
response = generate(model, tokenizer, prompt="Your prompt here", max_tokens=256, verbose=True)
All variants in this collection
| Model | Variant |
|---|---|
| sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp4-mlx | Block float MX FP4 |
| sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx | Block float MX FP8 ← this model |
| sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-optiq-5bpw-mlx | OptiQ mixed-precision (target 5.0 bpw) |
Notes
- Requires Apple Silicon (M1 or later) with MLX
- Benchmarks run on Apple M5 Pro, 24 GB unified memory
- License: see empero-ai/Qwythos-9B-Claude-Mythos-5-1M for the original model's license
Original model
See empero-ai/Qwythos-9B-Claude-Mythos-5-1M for full model details and intended use.
- Downloads last month
- 14
8-bit
Model tree for sahilchachra/Qwythos-9B-Claude-Mythos-5-1M-mxfp8-mlx
Base model
Qwen/Qwen3.5-9B-Base