Omi Med STT v1 MLX

Full-precision Apple Silicon / MLX export of Omi Med STT v1.

For most Mac users, the smaller q8 export is the recommended default. Use this repo when you specifically want the full MLX weights.

Quickstart

pip install -U "omi-med-stt[mlx]"
omi-med-stt audio.wav --runtime mlx --model omi-health/omi-med-stt-v1-mlx
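
For batch use, the same command can be driven from Python. This is a minimal sketch that relies only on the invocation shown above. The helper names `build_command` and `transcribe` are hypothetical, and treating standard output as the transcript is an assumption rather than documented CLI behavior.

```python
import subprocess
from pathlib import Path

MODEL_ID = "omi-health/omi-med-stt-v1-mlx"


def build_command(audio_path, model=MODEL_ID, runtime="mlx"):
    """Return the argv list for the Quickstart command (hypothetical helper)."""
    return ["omi-med-stt", str(Path(audio_path)), "--runtime", runtime, "--model", model]


def transcribe(audio_path):
    """Run the CLI on one file and return whatever it writes to stdout.

    Assumption: the transcript is printed to stdout. Check the CLI's own
    help output before relying on this.
    """
    result = subprocess.run(
        build_command(audio_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()
```

Calling `transcribe("audio.wav")` requires the package from the install step to be on the PATH. `build_command` can be inspected on its own without running anything.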

Evaluation

Full evaluation details: omi.health/research/omi-med-stt. The benchmark covers 7.18 hours of real and synthetic clinical speech across dialogue, dictation, medication review, procedures/devices/tests, and general speech. Speed is reported as the time to process one hour of audio (lower is faster), with the equivalent realtime multiple in parentheses.
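
The two speed figures in each table cell are related by simple arithmetic: one hour of audio is 3600 seconds, so a model running at N x realtime needs 3600 / N seconds. The sketch below illustrates that conversion. The function names are hypothetical, and because both figures in the tables are rounded, converting one to the other only reproduces the printed value approximately.

```python
def seconds_per_audio_hour(realtime_factor):
    """Seconds needed to process 3600 s of audio at the given x-realtime speed."""
    return 3600.0 / realtime_factor


def format_duration(seconds):
    """Render a duration as 'Xm Ys' or 'Ys', rounded to the nearest second."""
    total = round(seconds)
    minutes, secs = divmod(total, 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


# Realtime multiples taken from the tables in this card.
print(format_duration(seconds_per_audio_hour(146.3)))  # 25s
print(format_duration(seconds_per_audio_hour(64.5)))   # 56s
print(format_duration(seconds_per_audio_hour(67.4)))   # 53s
```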

NeMo vs Open / Local Models

Local GPU baselines were run on an NVIDIA A10 where applicable; VibeVoice-ASR 9B was run on an H100.

Model	WER	M-WER	Drug M-WER	Medical Recall	Speed: time / 1 hour audio (formula-derived x realtime)
VibeVoice-ASR 9B	11.10%	1.78%	1.36%	98.71%	5m 20s (11.2x)
Omi Med STT v1 NeMo	8.30%	2.37%	4.75%	97.95%	25s (146.3x)
Qwen3 ASR 1.7B	10.72%	3.13%	6.11%	97.21%	44s (81.1x)
Whisper Large v3 Turbo (A10)	11.98%	3.93%	5.88%	96.45%	1m 19s (45.8x)
Cohere Transcribe 03-2026	14.88%	5.05%	11.09%	95.16%	25s (146.3x)
Parakeet TDT 0.6B v3	15.26%	8.01%	9.50%	96.34%	23s (157.9x)
Parakeet TDT 0.6B v2 base	16.45%	8.36%	8.60%	96.20%	23s (153.8x)
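
For readers unfamiliar with the WER column: word error rate is conventionally the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The sketch below shows that conventional definition only. It assumes plain whitespace tokenization, which may differ from the text normalization used in this evaluation, and it does not cover M-WER, Drug M-WER, or Medical Recall, whose definitions are given in the linked evaluation details rather than in this card.

```python
def word_error_rate(reference, hypothesis):
    """Conventional WER: word-level Levenshtein distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of six reference words.
wer = word_error_rate(
    "start metoprolol 25 mg twice daily",
    "start metoprolol 25 mg daily",
)
print(f"{wer:.2%}")  # 16.67%
```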

Runtime Artifacts

These artifacts were scored on the same internal evaluation as the canonical NeMo checkpoint.

Artifact	WER	M-WER	Drug M-WER	Medical Recall	Speed: time / 1 hour audio (formula-derived x realtime)
NeMo canonical	8.30%	2.37%	4.75%	97.95%	25s (146.3x)
MLX full precision	8.59%	2.65%	5.20%	97.70%	56s (64.5x)
MLX q8	8.61%	2.75%	5.20%	97.63%	53s (67.4x)

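The full MLX export is slightly ahead of q8 on M-WER (2.65% vs 2.75%), with identical Drug M-WER (5.20%). The q8 export is much smaller, marginally faster in this evaluation (53s vs 56s per hour of audio), and is the default Mac artifact.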
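
The size of the gap between the two MLX artifacts can be read straight off the table. The sketch below computes the per-metric differences in percentage points, using only values copied from the Runtime Artifacts table; the dictionary layout and the `deltas` helper are illustrative, not part of any Omi tooling.

```python
# Values copied from the Runtime Artifacts table, in percent.
ARTIFACTS = {
    "MLX full precision": {
        "WER": 8.59, "M-WER": 2.65, "Drug M-WER": 5.20, "Medical Recall": 97.70,
    },
    "MLX q8": {
        "WER": 8.61, "M-WER": 2.75, "Drug M-WER": 5.20, "Medical Recall": 97.63,
    },
}


def deltas(baseline, candidate):
    """Per-metric difference (candidate minus baseline) in percentage points."""
    return {metric: round(candidate[metric] - baseline[metric], 2) for metric in baseline}


diff = deltas(ARTIFACTS["MLX full precision"], ARTIFACTS["MLX q8"])
for metric, value in diff.items():
    print(f"{metric}: {value:+.2f} pp")
```

For the error-rate columns a positive difference means q8 is worse; for Medical Recall a negative difference means q8 is worse. All differences are at most a tenth of a percentage point.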

Compatibility

This is not a drop-in parakeet-mlx checkpoint. Omi Med STT v1 includes a medical adapter, and the supported Mac path is the omi-med-stt CLI.

Safety

Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.

Safetensors
Model size: 0.6B params
Tensor type: F32

Model tree for omi-health/omi-med-stt-v1-mlx
Base model: nvidia/parakeet-tdt-0.6b-v2
Finetuned: omi-health/omi-med-stt-v1