Image-Text-to-Text
MLX
Safetensors
multilingual
deepseekocr
mlx-vlm
ocr
vision-language
baidu
quantized
4-bit precision
affine
conversational
How to use mikoy92/Unlimited-OCR-4bit-mlx with MLX:
# Make sure mlx-vlm is installed
# pip install --upgrade mlx-vlm

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model, processor = load("mikoy92/Unlimited-OCR-4bit-mlx")
config = load_config("mikoy92/Unlimited-OCR-4bit-mlx")

# Prepare input
image = ["http://images.cocodataset.org/val2017/000000039769.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=1
)

# Generate output
output = generate(model, processor, formatted_prompt, image)
print(output)
Unlimited-OCR 4-bit MLX
This is a 4-bit affine MLX quantization of mikoy92/Unlimited-OCR-bf16-mlx, converted with mlx-vlm.
Quantization settings:
- mode: affine
- bits: 4
- group size: 64
- observed effective bits per weight during conversion: 5.883
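To make the settings above concrete, here is a minimal pure-Python sketch of group-wise affine quantization: each group of weights is mapped to integer codes and stored with one scale and one bias, so that a weight is recovered as roughly scale * code + bias. The function names are hypothetical, and this only illustrates the idea; it is not the kernel that MLX or mlx-vlm actually runs.

```python
import math


def quantize_group(weights, bits=4):
    """Map one group of floats to integer codes in [0, 2**bits - 1].

    Illustrative only: real MLX kernels pack codes into 32-bit words
    and may choose the scale and bias differently.
    """
    levels = (1 << bits) - 1          # 15 for 4-bit codes
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / levels
    if scale == 0.0:                  # constant group: avoid division by zero
        scale = 1.0
    codes = [min(levels, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo           # lo plays the role of the bias


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate weights from codes, scale, and bias."""
    return [scale * q + bias for q in codes]


# One group of 64 weights, matching the group size listed above.
group = [math.sin(i) for i in range(64)]
codes, scale, bias = quantize_group(group, bits=4)
restored = dequantize_group(codes, scale, bias)
max_error = max(abs(a - b) for a, b in zip(group, restored))
print(f"scale={scale:.4f} bias={bias:.4f} max_error={max_error:.4f}")
```

With rounding to the nearest code, the reconstruction error per weight stays within half a quantization step (scale / 2).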
Because this is a vision-language OCR model, mlx-vlm does not quantize every tensor to 4 bits: some multimodal tensors stay at higher precision, so the effective bits per weight observed during conversion (5.883) is higher than the nominal 4.
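The gap between 4 and 5.883 can be sanity-checked with a little arithmetic. The sketch below assumes that each quantized group of 64 weights carries one 16-bit scale and one 16-bit bias, and that every unquantized tensor is stored at 16 bits; both are assumptions for illustration, not figures reported by the conversion. The function names are hypothetical.

```python
def quantized_bits_per_weight(bits, group_size, scale_bits=16, bias_bits=16):
    """Bits per weight for a quantized tensor, including per-group overhead.

    Assumes one scale and one bias per group (16 bits each by default).
    """
    return bits + (scale_bits + bias_bits) / group_size


def unquantized_fraction(observed_bpw, quantized_bpw, full_bits=16):
    """Fraction of weights that would have to stay at full_bits precision
    for the average to equal observed_bpw (a rough estimate only)."""
    return (observed_bpw - quantized_bpw) / (full_bits - quantized_bpw)


nominal = quantized_bits_per_weight(bits=4, group_size=64)
fraction = unquantized_fraction(observed_bpw=5.883, quantized_bpw=nominal)
print(f"quantized tensors: {nominal} bits per weight")
print(f"estimated share of weights left at 16 bits: {fraction:.1%}")
```

Under these assumptions the quantized tensors cost 4.5 bits per weight, and roughly 12% of the weights would need to remain at 16 bits to reach the observed 5.883. Treat this as an order-of-magnitude check rather than a measurement of the checkpoint.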
Usage
pip install -U mlx-vlm
mlx_vlm.generate \
--model mikoy92/Unlimited-OCR-4bit-mlx \
--image /path/to/image.png \
--prompt "Extract all readable text from this image." \
--max-tokens 512 \
--temperature 0
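For more than one image, the same command can be driven from a short script. The sketch below is one possible approach, not part of the original card: it reuses only the flags shown above, and the helper names (`build_ocr_command`, `ocr_directory`) and the choice to scan for `*.png` files are hypothetical.

```python
import subprocess
from pathlib import Path

MODEL_ID = "mikoy92/Unlimited-OCR-4bit-mlx"
DEFAULT_PROMPT = "Extract all readable text from this image."


def build_ocr_command(image_path, prompt=DEFAULT_PROMPT, max_tokens=512):
    """Build the argument list for one mlx_vlm.generate call.

    Only flags that appear in the usage example above are used.
    """
    return [
        "mlx_vlm.generate",
        "--model", MODEL_ID,
        "--image", str(image_path),
        "--prompt", prompt,
        "--max-tokens", str(max_tokens),
        "--temperature", "0",
    ]


def ocr_directory(directory, pattern="*.png"):
    """Run the CLI once per matching image and collect raw stdout.

    Requires mlx-vlm to be installed. The captured stdout may contain
    more than the recognized text (for example, generation statistics).
    """
    results = {}
    for image_path in sorted(Path(directory).glob(pattern)):
        completed = subprocess.run(
            build_ocr_command(image_path),
            capture_output=True,
            text=True,
            check=True,  # raise if the CLI exits with an error
        )
        results[image_path.name] = completed.stdout
    return results
```

Calling `ocr_directory("/path/to/scans")` would return a dictionary keyed by file name. The model is reloaded for every image this way, so for large batches the Python API (`load` once, then `generate` per image) is likely the better fit.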
Validation
Before upload, this checkpoint was loaded locally with mlx_vlm.generate and produced OCR text/table output on a document-image smoke test.
Model size: 0.9B params
Tensor type: BF16 · U32