Omi Med STT v1 MLX

Full-precision Apple Silicon / MLX export of Omi Med STT v1.

For most Mac users, the smaller q8 export is the recommended default. Use this repo when you specifically want the full MLX weights.

Quickstart

pip install -U "omi-med-stt[mlx]"
omi-med-stt audio.wav --runtime mlx --model omi-health/omi-med-stt-v1-mlx
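
For scripting, the same invocation can be wrapped from Python. The following is a minimal sketch that relies only on the command shape shown above. The helper names `build_command` and `transcribe` are hypothetical, and the assumption that the CLI prints the transcript to standard output is not confirmed here.

```python
import subprocess

MODEL_ID = "omi-health/omi-med-stt-v1-mlx"


def build_command(audio_path, model=MODEL_ID, runtime="mlx"):
    """Return the quickstart CLI invocation as an argument list."""
    return ["omi-med-stt", str(audio_path), "--runtime", runtime, "--model", model]


def transcribe(audio_path):
    """Run the CLI on one file and return whatever it prints to stdout.

    Assumption: the transcript is written to stdout. Adjust this helper
    if the CLI writes its result to a file instead.
    """
    result = subprocess.run(
        build_command(audio_path),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `transcribe("audio.wav")` would then run exactly the quickstart command; passing an argument list rather than a shell string avoids quoting problems with file names that contain spaces.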

Evaluation

Full evaluation details: omi.health/research/omi-med-stt. Benchmark: 7.18h of real and synthetic clinical speech across dialogue, dictation, medication review, procedures/devices/tests, and general speech. Speed is shown as time to process one hour of audio; lower is faster.
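
The two speed figures in the tables below are related by simple arithmetic. This sketch assumes the realtime factor is audio duration divided by processing time; the exact formula is not stated in this card, and the function names are illustrative. Because the listed times are rounded to whole seconds, recomputing the factor from them only approximates the listed value (for example, 3600 / 25 gives 144x, while the table lists 146.3x).

```python
def realtime_factor(processing_seconds, audio_seconds=3600.0):
    """How many seconds of audio are processed per second of compute."""
    return audio_seconds / processing_seconds


def time_per_hour(rtf):
    """Seconds needed to process one hour of audio at a given realtime factor."""
    return 3600.0 / rtf


def format_duration(seconds):
    """Render a duration the way the tables do, e.g. '25s' or '1m 19s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"
```

Going from the listed factor to the listed time is the more reliable direction: `format_duration(time_per_hour(146.3))` yields `25s`, matching the NeMo row.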

NeMo vs Open / Local Models

Local GPU baselines were run on A10 where applicable; VibeVoice-ASR 9B used H100.

| Model | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
| --- | --- | --- | --- | --- | --- |
| VibeVoice-ASR 9B | 11.10% | 1.78% | 1.36% | 98.71% | 5m 20s (11.2x) |
| Omi Med STT v1 NeMo | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| Qwen3 ASR 1.7B | 10.72% | 3.13% | 6.11% | 97.21% | 44s (81.1x) |
| Whisper Large v3 Turbo (A10) | 11.98% | 3.93% | 5.88% | 96.45% | 1m 19s (45.8x) |
| Cohere Transcribe 03-2026 | 14.88% | 5.05% | 11.09% | 95.16% | 25s (146.3x) |
| Parakeet TDT 0.6B v3 | 15.26% | 8.01% | 9.50% | 96.34% | 23s (157.9x) |
| Parakeet TDT 0.6B v2 base | 16.45% | 8.36% | 8.60% | 96.20% | 23s (153.8x) |

Runtime Artifacts

These artifacts were scored on the same internal evaluation as the canonical NeMo checkpoint.

| Artifact | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
| --- | --- | --- | --- | --- | --- |
| NeMo canonical | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| MLX full precision | 8.59% | 2.65% | 5.20% | 97.70% | 56s (64.5x) |
| MLX q8 | 8.61% | 2.75% | 5.20% | 97.63% | 53s (67.4x) |

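The full MLX export is slightly ahead of q8 on M-WER (2.65% vs. 2.75%) and Medical Recall (97.70% vs. 97.63%), with identical Drug M-WER. However, q8 is much smaller and marginally faster in this evaluation, so it is the default Mac artifact.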
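
To see the size of these gaps at a glance, the table rows can be differenced directly. This is a small illustrative sketch: the numbers are copied from the Runtime Artifacts table, while the names `ARTIFACTS`, `METRICS`, and `delta_vs` are made up for the example.

```python
# (WER, M-WER, Drug M-WER, Medical Recall) in percent, copied from the
# Runtime Artifacts table.
ARTIFACTS = {
    "NeMo canonical": (8.30, 2.37, 4.75, 97.95),
    "MLX full precision": (8.59, 2.65, 5.20, 97.70),
    "MLX q8": (8.61, 2.75, 5.20, 97.63),
}
METRICS = ("WER", "M-WER", "Drug M-WER", "Medical Recall")


def delta_vs(baseline, artifact):
    """Percentage-point difference (artifact minus baseline) for each metric."""
    base = ARTIFACTS[baseline]
    other = ARTIFACTS[artifact]
    return {m: round(o - b, 2) for m, b, o in zip(METRICS, base, other)}


if __name__ == "__main__":
    for name in ("MLX full precision", "MLX q8"):
        print(name, delta_vs("NeMo canonical", name))
```

For the error metrics a positive delta means the artifact is worse than the baseline; for Medical Recall a negative delta means it is worse.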

Compatibility

This is not a drop-in parakeet-mlx checkpoint. Omi Med STT v1 includes a medical adapter, and the supported Mac path is the omi-med-stt CLI.
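
Since the supported path is the CLI on Apple Silicon, a script may want to check its environment before attempting a run. The sketch below is not part of the omi-med-stt package; the function name `preflight` and its messages are hypothetical. It assumes a native arm64 Python, where `platform.machine()` reports `arm64`; under Rosetta it would report `x86_64`.

```python
import platform
import shutil


def preflight(system=None, machine=None, cli_path=None):
    """Return a list of problems that would block the MLX path (empty if none).

    Arguments default to values detected on the current machine; they can be
    passed explicitly for testing.
    """
    system = system or platform.system()
    machine = machine or platform.machine()
    if cli_path is None:
        cli_path = shutil.which("omi-med-stt")  # None if the CLI is not on PATH

    problems = []
    if system != "Darwin" or machine != "arm64":
        problems.append(f"MLX targets Apple Silicon; found {system}/{machine}")
    if not cli_path:
        problems.append('omi-med-stt CLI not found; install "omi-med-stt[mlx]"')
    return problems
```

An empty list from `preflight()` only means the prerequisites checked here look satisfied; it does not verify that the model weights can be downloaded or loaded.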

Safety

Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.
