Omi Med STT v1 MLX q8

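An 8-bit quantized (q8) MLX export of Omi Med STT v1 for Apple Silicon.

This is the default Mac artifact used by the omi-med-stt CLI. It is much smaller than the full-precision MLX export and scores almost the same on the benchmark below (8.61% vs 8.59% WER, and an identical 5.20% Drug M-WER).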

Quickstart

pip install -U "omi-med-stt[mlx]"
omi-med-stt audio.wav

To select the runtime and model explicitly:

omi-med-stt audio.wav --runtime mlx --model omi-health/omi-med-stt-v1-mlx-q8
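For scripted use, the same commands can be assembled and run from Python. The following is a minimal sketch, not part of the omi-med-stt package: `build_command` and `transcribe` are hypothetical helper names, only the `--runtime` and `--model` flags shown above are used, and the assumption that the CLI writes the transcript to standard output is not confirmed by this card.

```python
import subprocess
from typing import List, Optional


def build_command(audio_path: str,
                  runtime: Optional[str] = None,
                  model: Optional[str] = None) -> List[str]:
    """Assemble the CLI invocation shown in the Quickstart as an argument list."""
    cmd = ["omi-med-stt", audio_path]
    if runtime is not None:
        cmd += ["--runtime", runtime]
    if model is not None:
        cmd += ["--model", model]
    return cmd


def transcribe(audio_path: str, **kwargs: Optional[str]) -> str:
    """Run the CLI and return whatever it prints.

    Assumption: the transcript is written to stdout. Check the CLI's own
    help output before relying on this.
    """
    result = subprocess.run(
        build_command(audio_path, **kwargs),
        check=True,            # raise CalledProcessError on a non-zero exit code
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `transcribe("audio.wav", runtime="mlx", model="omi-health/omi-med-stt-v1-mlx-q8")` mirrors the explicit command above. Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.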

Evaluation

Full evaluation details: omi.health/research/omi-med-stt. The benchmark is 7.18 hours of real and synthetic clinical speech covering dialogue, dictation, medication review, procedures/devices/tests, and general speech. Speed is reported as the time needed to process one hour of audio, so lower is faster.
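For readers unfamiliar with the headline metric, WER (word error rate) is conventionally the word-level edit distance between a reference transcript and the model output, divided by the number of reference words. The sketch below implements that textbook definition only. It is not the scoring pipeline used for this benchmark: the lowercasing and whitespace tokenization are simplifying assumptions, and the medical variants (M-WER, Drug M-WER) are not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Textbook WER: (substitutions + deletions + insertions) / reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring "take two tablet daily" against the reference "take two tablets daily" gives one substitution over four reference words.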

NeMo vs Open / Local Models

Local GPU baselines were run on an A10 where applicable; VibeVoice-ASR 9B was run on an H100.

Model	WER	M-WER	Drug M-WER	Medical Recall	Speed: time / 1 hour audio (formula-derived x realtime)
VibeVoice-ASR 9B	11.10%	1.78%	1.36%	98.71%	5m 20s (11.2x)
Omi Med STT v1 NeMo	8.30%	2.37%	4.75%	97.95%	25s (146.3x)
Qwen3 ASR 1.7B	10.72%	3.13%	6.11%	97.21%	44s (81.1x)
Whisper Large v3 Turbo (A10)	11.98%	3.93%	5.88%	96.45%	1m 19s (45.8x)
Cohere Transcribe 03-2026	14.88%	5.05%	11.09%	95.16%	25s (146.3x)
Parakeet TDT 0.6B v3	15.26%	8.01%	9.50%	96.34%	23s (157.9x)
Parakeet TDT 0.6B v2 base	16.45%	8.36%	8.60%	96.20%	23s (153.8x)
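The two numbers in the speed column describe the same measurement: a realtime factor of x means one hour of audio (3600 seconds) takes 3600 / x seconds to process. A minimal sketch of that conversion follows; the function names are illustrative and not taken from the evaluation code.

```python
def seconds_per_audio_hour(realtime_factor: float) -> float:
    """Seconds needed to process one hour of audio at a given realtime factor."""
    if realtime_factor <= 0:
        raise ValueError("realtime factor must be positive")
    return 3600.0 / realtime_factor


def format_duration(seconds: float) -> str:
    """Render a duration in the table's style, e.g. '25s' or '5m 20s'."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


# Realtime factor for the NeMo checkpoint, copied from the table above.
print(format_duration(seconds_per_audio_hour(146.3)))
```

Because both columns are rounded independently, converting a listed realtime factor back to a duration can differ from the listed time by a second or so.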

Runtime Artifacts

All three artifacts were scored on the same internal evaluation as the canonical NeMo checkpoint.

Artifact	WER	M-WER	Drug M-WER	Medical Recall	Speed: time / 1 hour audio (formula-derived x realtime)
NeMo canonical	8.30%	2.37%	4.75%	97.95%	25s (146.3x)
MLX full precision	8.59%	2.65%	5.20%	97.70%	56s (64.5x)
MLX q8	8.61%	2.75%	5.20%	97.63%	53s (67.4x)

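Why q8 is the Mac default: it is smaller than the full-precision export and therefore easier to download, and in this evaluation it matched the full MLX export on Drug M-WER (5.20% for both), while WER and M-WER rose by only 0.02 and 0.10 percentage points.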
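The gap between artifacts can be read straight off the Runtime Artifacts table. The sketch below copies the table's values and reports the difference in percentage points between any two rows; the `ARTIFACTS` dictionary and `delta` helper are illustrative names, and no data beyond the table is used.

```python
# Metric values (in percent) copied from the Runtime Artifacts table.
ARTIFACTS = {
    "NeMo canonical":     {"WER": 8.30, "M-WER": 2.37, "Drug M-WER": 4.75, "Medical Recall": 97.95},
    "MLX full precision": {"WER": 8.59, "M-WER": 2.65, "Drug M-WER": 5.20, "Medical Recall": 97.70},
    "MLX q8":             {"WER": 8.61, "M-WER": 2.75, "Drug M-WER": 5.20, "Medical Recall": 97.63},
}


def delta(candidate: str, baseline: str) -> dict:
    """Percentage-point difference (candidate minus baseline) for each metric."""
    cand, base = ARTIFACTS[candidate], ARTIFACTS[baseline]
    return {metric: round(cand[metric] - base[metric], 2) for metric in cand}


for metric, diff in delta("MLX q8", "MLX full precision").items():
    print(f"{metric}: {diff:+.2f} pp")
```

Positive values are worse for the three error rates and better for Medical Recall. Comparing "MLX q8" against "NeMo canonical" instead shows the cost of moving from the NeMo checkpoint to the Mac runtime.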

Compatibility

This is not a drop-in parakeet-mlx checkpoint. Omi Med STT v1 includes a medical adapter, and the supported Mac path is the omi-med-stt CLI.

Safety

Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.

Model details

Safetensors, MLX, quantized
Model size: 0.2B params
Tensor types: F32, U32

Model tree for omi-health/omi-med-stt-v1-mlx-q8

Base model: nvidia/parakeet-tdt-0.6b-v2
Finetuned: omi-health/omi-med-stt-v1