mlx-community/rfdetr-base-fp32

This model was converted to MLX format from RF-DETR (ICLR 2026) using mlx-vlm version 0.4.3.

RF-DETR is a real-time detection transformer achieving state-of-the-art performance on COCO.

Use with mlx

pip install -U mlx-vlm

from pathlib import Path
from PIL import Image
from mlx_vlm.utils import load_model
from mlx_vlm.models.rfdetr.processing_rfdetr import RFDETRProcessor
from mlx_vlm.models.rfdetr.generate import RFDETRPredictor

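# Load the converted weights and the matching image processor.
model = load_model(Path("mlx-community/rfdetr-base-fp32"))
processor = RFDETRProcessor.from_pretrained("mlx-community/rfdetr-base-fp32")
# score_threshold drops low-confidence detections; nms_threshold controls non-maximum suppression.
predictor = RFDETRPredictor(model, processor, score_threshold=0.3, nms_threshold=0.5)

# Run detection on one image and print each class name, score, and box.
result = predictor.predict(Image.open("image.jpg"))
for name, score, box in zip(result.class_names, result.scores, result.boxes):
    print(f"{name}: {score:.2f} [{box[0]:.0f}, {box[1]:.0f}, {box[2]:.0f}, {box[3]:.0f}]")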
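
The loop above prints detections, but downstream code usually wants them as plain data. The following is a minimal sketch, assuming only what the loop already relies on: that `result.class_names`, `result.scores`, and `result.boxes` are parallel sequences and that each box holds four numbers. The helper name `detections_to_records` and its `min_score` parameter are hypothetical and not part of mlx-vlm.

```python
import json


def detections_to_records(class_names, scores, boxes, min_score=0.0):
    """Turn parallel detection fields into a list of plain dicts (hypothetical helper)."""
    records = []
    for name, score, box in zip(class_names, scores, boxes):
        score = float(score)
        if score < min_score:
            continue  # Skip detections below the caller's extra cutoff.
        records.append({
            "label": str(name),
            "score": round(score, 4),
            "box": [float(v) for v in box],  # Four numbers, kept in the order given.
        })
    # Highest-confidence detections first.
    records.sort(key=lambda r: r["score"], reverse=True)
    return records


if __name__ == "__main__":
    # Stand-in values; with the predictor, pass result.class_names,
    # result.scores, and result.boxes instead.
    demo = detections_to_records(
        ["dog", "person"],
        [0.42, 0.91],
        [[10, 20, 110, 220], [5, 5, 50, 80]],
        min_score=0.5,
    )
    print(json.dumps(demo, indent=2))
```

Converting every value to a built-in `str` or `float` keeps the records JSON-serializable regardless of the array type the predictor returns, provided that type supports `float()` on its elements.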

CLI

# Image
python -m mlx_vlm.models.rfdetr.generate --image photo.jpg --model mlx-community/rfdetr-base-fp32

# Video
python -m mlx_vlm.models.rfdetr.generate --video input.mp4 --model mlx-community/rfdetr-base-fp32

# Realtime camera
python -m mlx_vlm.models.rfdetr.generate --task realtime --model mlx-community/rfdetr-base-fp32
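
To run the image command over a whole folder, one option is to build the same invocation once per file. This is a sketch that uses only the `--image` and `--model` flags shown above; the function names, the `*.jpg` pattern, and the `dry_run` switch are hypothetical conveniences rather than part of the CLI.

```python
import subprocess
import sys
from pathlib import Path

MODEL = "mlx-community/rfdetr-base-fp32"


def build_image_command(image_path, model=MODEL):
    """Build the argv list for the image CLI invocation shown above."""
    return [
        sys.executable, "-m", "mlx_vlm.models.rfdetr.generate",
        "--image", str(image_path),
        "--model", model,
    ]


def run_on_folder(folder, pattern="*.jpg", dry_run=True):
    """Collect one command per matching image; execute them unless dry_run is set."""
    commands = [build_image_command(p) for p in sorted(Path(folder).glob(pattern))]
    if not dry_run:
        for cmd in commands:
            # check=True raises CalledProcessError if the CLI exits with an error.
            subprocess.run(cmd, check=True)
    return commands
```

Calling `run_on_folder("photos")` returns the commands without executing them, which makes it easy to inspect what would run before passing `dry_run=False`.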

Model Details

Architecture: DINOv2-small backbone + C2f projector + Deformable DETR decoder
Task: Object detection (COCO, 80 classes)
Parameters: ~32M
Input resolution: 560x560
Dtype: float32
Inference (M4 Max): 32 ms per image (31 FPS)
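
The latency and FPS figures are two views of the same number: 1000 ms divided by 32 ms per image gives 31.25, which rounds to the 31 FPS listed. A rough way to check the figure on other hardware is sketched below. Both function names are hypothetical, and the sketch assumes the timed callable returns only after the result has actually been computed (MLX evaluates lazily, so a call that merely builds a graph would look faster than it is).

```python
import time


def fps_from_latency_ms(latency_ms):
    """Convert a per-image latency in milliseconds to frames per second."""
    return 1000.0 / latency_ms


def mean_latency_ms(fn, warmup=2, runs=10):
    """Average wall-clock time of fn() in milliseconds, after a few warmup calls."""
    for _ in range(warmup):
        fn()  # Warmup calls absorb one-time costs such as compilation or caching.
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) * 1000.0 / runs


if __name__ == "__main__":
    # Figure from the table above.
    print(fps_from_latency_ms(32))
```

With the predictor from the usage section, `mean_latency_ms(lambda: predictor.predict(image))` would time end-to-end prediction, including pre- and post-processing, so it may not match a figure measured on the forward pass alone.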

Reference
