+---
+license: apache-2.0
+base_model: allenai/olmOCR-2-7B-1025
+tags:
+- mlx
+- vision
+- ocr
+- quantized
+- apple-silicon
+---
+# olmOCR-2-7B-1025-MLX-8bit
+This is an 8-bit quantized version of [allenai/olmOCR-2-7B-1025](https://huggingface.co/allenai/olmOCR-2-7B-1025) optimized for Apple Silicon using MLX.
+## Model Description
+olmOCR-2 is a state-of-the-art OCR (optical character recognition) vision-language model fine-tuned from Qwen2.5-VL-7B-Instruct. This version stores the weights at 8-bit precision, which shrinks the model from roughly 14 GB (BF16) to 8.4 GB while staying closest to the original weights among the quantized variants listed below.
+- **Base Model:** allenai/olmOCR-2-7B-1025
+- **Quantization:** 8-bit using MLX
+- **Model Size:** 8.4 GB (down from ~14 GB BF16)
+- **Size Reduction:** ~40%
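The ~40% figure follows directly from the two sizes quoted above. A minimal sketch of the arithmetic, using only the numbers stated in this card (the function name `size_reduction` is illustrative, not part of any library):

```python
# Sizes quoted in this card, in gigabytes.
BF16_SIZE_GB = 14.0  # approximate size of the BF16 weights
Q8_SIZE_GB = 8.4     # size of this 8-bit MLX conversion


def size_reduction(original_gb: float, quantized_gb: float) -> float:
    """Return the fractional size reduction, e.g. 0.4 for a 40% saving."""
    return (original_gb - quantized_gb) / original_gb


reduction = size_reduction(BF16_SIZE_GB, Q8_SIZE_GB)
print(f"Size reduction: {reduction:.0%}")
```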
+## Performance
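+The base olmOCR-2 model achieves **82.4 points on olmOCR-Bench**, representing state-of-the-art performance for real-world OCR of English-language digitized print documents. This score is for the base model; this card does not report a separate benchmark for the 8-bit weights. The base model was additionally fine-tuned with GRPO reinforcement learning to improve performance on: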
+- Math equations
+- Tables
+- Complex layouts
+- Handwriting
+## Usage
+### Requirements
+```bash
+pip install mlx-vlm
+```
+### Basic Usage
+```python
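+from mlx_vlm import load, generate
+from mlx_vlm.prompt_utils import apply_chat_template
+from mlx_vlm.utils import load_config
+model_path = "richardyoung/olmOCR-2-7B-1025-MLX-8bit"
+# Load the model, its processor, and its config
+model, processor = load(model_path)
+config = load_config(model_path)
+# Path to the page image (a list, so num_images can be derived from it)
+image = ["document.png"]
+# Wrap the instruction in the model's chat template so the image placeholder is included
+prompt = apply_chat_template(
+    processor, config, "Extract all text from this image.", num_images=len(image)
+)
+# Pass prompt and image by keyword: their positional order differs between mlx-vlm releases
+output = generate(model, processor, prompt=prompt, image=image, max_tokens=2048)
+print(output)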
+```
+### Command Line
+```bash
+python -m mlx_vlm.generate \
+  --model richardyoung/olmOCR-2-7B-1025-MLX-8bit \
+  --image document.png \
+  --prompt "Extract all text from this image." \
+  --max-tokens 2048
+```
+## Quantization Details
+- **Method:** MLX native quantization
+- **Bits:** 8-bit
+- **Group Size:** MLX default (64)
+- **Recommended for:** Users who prioritize quality and have at least 10 GB of RAM
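To make "8-bit" and "group size" concrete, here is a minimal pure-Python sketch of group-wise affine quantization: each group of weights shares one scale and one offset, and every weight is stored as an integer code. This is an illustration of the general technique only, not MLX's actual kernels or storage layout. The group size of 64 and the 16-bit scale and offset per group are assumptions, and all function names are hypothetical.

```python
def quantize_group(weights, bits=8):
    """Affine-quantize one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1            # 255 for 8-bit
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels or 1.0   # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def quantize(weights, group_size=64, bits=8):
    """Split a flat weight list into groups and quantize each one independently."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]


def dequantize(groups):
    """Reconstruct approximate floats from (codes, scale, offset) triples."""
    return [c * scale + offset for codes, scale, offset in groups for c in codes]


def bits_per_weight(bits, group_size, meta_bits=16):
    """Effective storage per weight, assuming one scale and one offset of
    `meta_bits` bits each per group (an assumption, not a measured value)."""
    return bits + 2 * meta_bits / group_size


print(bits_per_weight(8, 64))  # 8.5
```

The reconstruction error of each weight is bounded by half of its group's scale, which is why smaller groups or more bits give a closer match to the original weights at the cost of more storage.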
+## Model Variants
+| Variant | Size | Relative Quality | Use Case |
+|---------|------|-----------|----------|
+| [8-bit](https://huggingface.co/richardyoung/olmOCR-2-7B-1025-MLX-8bit) | 8.4 GB | Highest | Best quality, more RAM |
+| [6-bit](https://huggingface.co/richardyoung/olmOCR-2-7B-1025-MLX-6bit) | 6.4 GB | High | Balanced quality/size |
+| [4-bit](https://huggingface.co/richardyoung/olmOCR-2-7B-1025-MLX-4bit) | 4.5 GB | Good | Smallest size, less RAM |
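As a rough way to choose between the variants, the sketch below picks the largest one that fits in a given amount of RAM. The sizes and repo ids come from the table above. The 1.6 GB headroom is an assumption extrapolated from this card's pairing of an 8.4 GB model with a 10 GB recommendation, and `pick_variant` is a hypothetical helper, not part of MLX-VLM.

```python
# (label, repo id, size in GB) from the variants table, largest first.
VARIANTS = [
    ("8-bit", "richardyoung/olmOCR-2-7B-1025-MLX-8bit", 8.4),
    ("6-bit", "richardyoung/olmOCR-2-7B-1025-MLX-6bit", 6.4),
    ("4-bit", "richardyoung/olmOCR-2-7B-1025-MLX-4bit", 4.5),
]


def pick_variant(available_ram_gb, headroom_gb=1.6):
    """Return the repo id of the largest variant that fits, or None if none does.

    headroom_gb is an assumed allowance for activations and the OS,
    not a measured requirement.
    """
    for _label, repo_id, size_gb in VARIANTS:
        # Small tolerance so exact boundaries are not lost to float rounding.
        if size_gb + headroom_gb <= available_ram_gb + 1e-9:
            return repo_id
    return None


print(pick_variant(16))
```

Actual memory use also depends on image resolution and the number of generated tokens, so treat the result as a starting point rather than a guarantee.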
+## System Requirements
+- **Platform:** Apple Silicon (M1/M2/M3/M4)
+- **RAM:** 10+ GB recommended
+- **OS:** macOS 13.5+ (the minimum listed in the MLX installation docs; macOS 14 or later is recommended there)
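The platform requirement can be checked before downloading the weights. A minimal sketch using only the standard library (the function name `is_apple_silicon` is illustrative):

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when running natively on an Apple Silicon Mac.

    The arguments default to the current host and exist only to make
    the check easy to test on other machines.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


if not is_apple_silicon():
    print("This host does not look like a native Apple Silicon Mac.")
```

A Python interpreter running under Rosetta reports `x86_64` rather than `arm64`, so this check also catches a non-native Python install on Apple Silicon hardware.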
+## Limitations
+- Optimized primarily for English-language printed documents
+- May have reduced performance on handwritten text compared to printed text
+- Requires Apple Silicon hardware for optimal performance
+## Citation
+```bibtex
+@article{olmocr2,
+  title={olmOCR 2: Unit test rewards for document OCR},
+  author={Allen Institute for AI},
+  year={2025}
+}
+```
+## License
+Apache 2.0 (inherited from base model)
+## Acknowledgements
+- Base model by [Allen Institute for AI](https://allenai.org/)
+- Quantized for MLX by richardyoung
+- Built with [MLX-VLM](https://github.com/Blaizzy/mlx-vlm)
+---
+*Generated with Claude Code*