How to use mlx-community/YOLO26x-OptiQ-6bit with MLX:

```bash
# Download the model from the Hub
# (the extra is quoted so that zsh does not treat the brackets as a glob)
pip install "huggingface_hub[hf_xet]"
huggingface-cli download --local-dir YOLO26x-OptiQ-6bit mlx-community/YOLO26x-OptiQ-6bit
```
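The same download can also be scripted from Python. The following is a minimal sketch that uses `snapshot_download` from `huggingface_hub`; the helper name `download_model` and the two constants are illustrative and not part of any library.

```python
REPO_ID = "mlx-community/YOLO26x-OptiQ-6bit"
LOCAL_DIR = "YOLO26x-OptiQ-6bit"


def download_model(repo_id: str = REPO_ID, local_dir: str = LOCAL_DIR) -> str:
    """Download every file in the repo and return the local folder path.

    `download_model` is a hypothetical helper written for this example.
    """
    # Imported lazily so the module can be imported without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


# Usage (needs network access and `pip install huggingface_hub`):
#     path = download_model()
```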
Mixed-precision quantized YOLO26x for Apple Silicon via optiq
This is a mixed-precision quantized version of YOLO26x in MLX format, optimized with mlx-optiq for Apple Silicon inference via yolo-mlx.
| Property | Value |
|---|---|
| Target BPW | 6.0 |
| Achieved BPW | 6.00 |
| Layers at 4-bit | 16 |
| Layers at 8-bit | 174 |
| Original size | 225.5 MB |
| Quantized size | 50.6 MB |
| Compression | 4.5x |
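
As a quick sanity check on the table above, the sizes can be turned into a compression ratio and an effective on-disk bit count. This sketch assumes the original checkpoint stores every value as 32-bit floats. The gap between the 6-bit ideal and the reported size is presumably quantization metadata (scales, biases) plus any tensors left unquantized, but that is an assumption, not something the table states.

```python
ORIGINAL_MB = 225.5   # FP32 checkpoint size from the table
QUANTIZED_MB = 50.6   # quantized checkpoint size from the table
TARGET_BPW = 6.0      # target bits per weight from the table
FP32_BITS = 32        # assumption: the original stores only 32-bit floats

compression = ORIGINAL_MB / QUANTIZED_MB
effective_bits = FP32_BITS * QUANTIZED_MB / ORIGINAL_MB
ideal_mb = ORIGINAL_MB * TARGET_BPW / FP32_BITS

print(f"compression: {compression:.2f}x")                          # compression: 4.46x
print(f"effective bits per original value: {effective_bits:.2f}")  # 7.18
print(f"size at exactly 6 bits per value: {ideal_mb:.1f} MB")      # 42.3 MB
```
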
| Model | Total Detections | Avg/Image |
|---|---|---|
| optiq 6-bit | 780 | 6.1 |
| Original (FP32) | 789 | 6.2 |
Detection delta: -9 (-1.1%) at 4.5x compression.
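The delta can be reproduced directly from the detection table. The sketch below also checks which image counts are consistent with both per-image averages, assuming those averages were rounded to one decimal place; the evaluation set itself is not named in this card.

```python
FP32_TOTAL = 789    # total detections, original model
OPTIQ_TOTAL = 780   # total detections, optiq 6-bit model

delta = OPTIQ_TOTAL - FP32_TOTAL
relative_pct = 100 * delta / FP32_TOTAL
print(f"delta: {delta} ({relative_pct:.1f}%)")  # delta: -9 (-1.1%)

# Image counts n for which both reported averages (6.1 and 6.2) round correctly.
consistent_counts = [
    n for n in range(1, 1000)
    if round(OPTIQ_TOTAL / n, 1) == 6.1 and round(FP32_TOTAL / n, 1) == 6.2
]
print(consistent_counts)  # [127, 128]
```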
Requires mlx-optiq and yolo-mlx:
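```bash
pip install mlx-optiq yolo-mlx
```

```python
from optiq.models.yolo import load_quantized_yolo

# Load the quantized model by its Hub repo ID, then run detection on one image
model = load_quantized_yolo("mlx-community/YOLO26x-OptiQ-6bit")
results = model.predict("image.jpg")
```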
optiq measures each conv layer's sensitivity via KL divergence on detection outputs, then assigns optimal per-layer bit-widths using greedy knapsack optimization. Sensitive layers (detection head, feature pyramid) get 8-bit precision while robust backbone layers get 4-bit.
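The allocation step described above can be illustrated with a toy example. This is a minimal sketch, not optiq's actual implementation: `Layer`, `kl_divergence`, `allocate_bits`, `average_bpw`, and all the numbers are hypothetical. It also assumes that moving a layer from 4-bit to 8-bit removes that layer's measured KL penalty, and it ignores quantization metadata overhead.

```python
import math
from dataclasses import dataclass


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions given as probability lists.

    A score like this, computed between full-precision and quantized
    detection outputs, is one way a per-layer sensitivity could be obtained.
    """
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q) if pi > 0)


@dataclass
class Layer:
    name: str
    num_params: int
    kl_at_4bit: float  # hypothetical sensitivity when this layer runs at 4-bit


def allocate_bits(layers, target_bpw, low=4, high=8):
    """Greedy knapsack: start every layer at `low` bits, then upgrade layers in
    order of sensitivity per extra bit spent, while the bit budget allows."""
    total_params = sum(layer.num_params for layer in layers)
    budget = (target_bpw - low) * total_params  # spare bits above the all-low baseline
    bits = {layer.name: low for layer in layers}
    ranked = sorted(
        layers,
        key=lambda layer: layer.kl_at_4bit / ((high - low) * layer.num_params),
        reverse=True,
    )
    for layer in ranked:
        cost = (high - low) * layer.num_params
        if cost <= budget:
            bits[layer.name] = high
            budget -= cost
    return bits


def average_bpw(layers, bits):
    """Parameter-weighted average bits per weight."""
    total_params = sum(layer.num_params for layer in layers)
    return sum(bits[layer.name] * layer.num_params for layer in layers) / total_params


# Toy model: sensitivities rise from backbone to detection head.
layers = [
    Layer("backbone.0", 1000, 0.002),
    Layer("backbone.1", 4000, 0.004),
    Layer("neck.0", 2000, 0.050),
    Layer("head.0", 1000, 0.200),
]
bits = allocate_bits(layers, target_bpw=6.0)
print(bits)                       # {'backbone.0': 8, 'backbone.1': 4, 'neck.0': 8, 'head.0': 8}
print(average_bpw(layers, bits))  # 6.0
```

In this toy run the large, robust backbone layer stays at 4-bit while the head and neck are upgraded first, which mirrors the pattern described above. A greedy pass like this is fast but is not guaranteed to find the best possible assignment for every budget.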
For more details on the methodology and results, see: Not All Layers Are Equal
Base model: Ultralytics/YOLO26 (this model is a quantized derivative)