Omi Med STT v1 MLX
Full-precision Apple Silicon / MLX export of Omi Med STT v1.
For most Mac users, the smaller q8 export is the recommended default. Use this repo when you specifically want the full MLX weights.
Quickstart
pip install -U "omi-med-stt[mlx]"
omi-med-stt audio.wav --runtime mlx --model omi-health/omi-med-stt-v1-mlx
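To run the Quickstart command over several files, one option is a thin Python wrapper around the CLI. The sketch below only reuses the arguments shown above (`--runtime`, `--model`, and a positional audio path). The helper names `build_command` and `transcribe_folder` are hypothetical, and it assumes the CLI writes the transcript to standard output, which this card does not confirm.

```python
import shutil
import subprocess
from pathlib import Path

MODEL_ID = "omi-health/omi-med-stt-v1-mlx"


def build_command(audio_path, model=MODEL_ID, runtime="mlx"):
    """Return the Quickstart command as an argument list for one audio file."""
    return ["omi-med-stt", str(audio_path), "--runtime", runtime, "--model", model]


def transcribe_folder(folder):
    """Run the CLI once per .wav file and return {filename: captured stdout}."""
    if shutil.which("omi-med-stt") is None:
        raise RuntimeError('omi-med-stt not found; run: pip install -U "omi-med-stt[mlx]"')
    results = {}
    for wav in sorted(Path(folder).glob("*.wav")):
        proc = subprocess.run(
            build_command(wav), capture_output=True, text=True, check=True
        )
        # Assumption: the transcript is printed to stdout.
        results[wav.name] = proc.stdout
    return results
```

Calling `transcribe_folder("recordings")` would invoke the CLI once per file, so the model is reloaded on each call. That is simple, but slow for large batches.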
Evaluation
Full evaluation details: omi.health/research/omi-med-stt. Benchmark: 7.18h of real and synthetic clinical speech across dialogue, dictation, medication review, procedures/devices/tests, and general speech. Speed is shown as time to process one hour of audio; lower is faster.
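The tables in this section pair each processing time with an x-realtime factor. As a rough sketch, the two can be converted into each other, assuming the usual definition of a realtime factor (audio duration divided by processing time). The exact formula used for the evaluation is not reproduced here, and the function names are illustrative.

```python
def seconds_per_hour(realtime_factor):
    """Seconds needed to process one hour (3600 s) of audio at a given x-realtime factor."""
    return 3600.0 / realtime_factor


def realtime_factor(seconds_for_one_hour):
    """x-realtime factor implied by the time taken to process one hour of audio."""
    return 3600.0 / seconds_for_one_hour


def format_duration(seconds):
    """Format a duration as '56s' or '1m 19s', rounded to the nearest second."""
    minutes, secs = divmod(round(seconds), 60)
    return f"{minutes}m {secs}s" if minutes else f"{secs}s"


print(format_duration(seconds_per_hour(64.5)))  # MLX full precision row: 56s
print(format_duration(seconds_per_hour(45.8)))  # Whisper Large v3 Turbo row: 1m 19s
```

Because the published factors are rounded to one decimal place, a time recomputed this way can differ from the table by about a second.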
NeMo vs Open / Local Models
Local GPU baselines were run on an A10 where applicable; VibeVoice-ASR 9B was run on an H100.
| Model | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
|---|---|---|---|---|---|
| VibeVoice-ASR 9B | 11.10% | 1.78% | 1.36% | 98.71% | 5m 20s (11.2x) |
| Omi Med STT v1 NeMo | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| Qwen3 ASR 1.7B | 10.72% | 3.13% | 6.11% | 97.21% | 44s (81.1x) |
| Whisper Large v3 Turbo (A10) | 11.98% | 3.93% | 5.88% | 96.45% | 1m 19s (45.8x) |
| Cohere Transcribe 03-2026 | 14.88% | 5.05% | 11.09% | 95.16% | 25s (146.3x) |
| Parakeet TDT 0.6B v3 | 15.26% | 8.01% | 9.50% | 96.34% | 23s (157.9x) |
| Parakeet TDT 0.6B v2 base | 16.45% | 8.36% | 8.60% | 96.20% | 23s (153.8x) |
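For readers unfamiliar with the WER column, the sketch below shows the standard word error rate: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It is a generic illustration, not the evaluation's scoring code. The lowercasing and whitespace tokenization are assumptions, and the medical-term metrics (M-WER, Drug M-WER, Medical Recall) are not covered.

```python
def word_error_rate(reference, hypothesis):
    """Standard WER: word-level Levenshtein distance / number of reference words."""
    ref = reference.lower().split()  # assumption: simple lowercase + whitespace split
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of four reference words gives a WER of 0.25.
print(word_error_rate("patient takes metformin daily",
                      "patient takes metoprolol daily"))
```

The example also hints at why a medical-term metric is reported separately: a single substituted drug name barely moves overall WER but changes the meaning of the transcript.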
Runtime Artifacts
These artifacts were scored on the same internal evaluation as the canonical checkpoint.
| Artifact | WER | M-WER | Drug M-WER | Medical Recall | Speed: time / 1 hour audio (formula-derived x realtime) |
|---|---|---|---|---|---|
| NeMo canonical | 8.30% | 2.37% | 4.75% | 97.95% | 25s (146.3x) |
| MLX full precision | 8.59% | 2.65% | 5.20% | 97.70% | 56s (64.5x) |
| MLX q8 | 8.61% | 2.75% | 5.20% | 97.63% | 53s (67.4x) |
The full MLX export is slightly ahead of q8 on M-WER (2.65% vs 2.75%) and matches it on Drug M-WER (5.20%), but q8 is much smaller and is the default Mac artifact.
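To see how close the artifacts are, the gaps can be computed directly from the Runtime Artifacts table. The snippet below is a small sketch that copies the table values into a dictionary; the key names and the `metric_deltas` helper are illustrative only.

```python
# Values copied from the Runtime Artifacts table (all in percent).
ARTIFACTS = {
    "NeMo canonical":     {"wer": 8.30, "m_wer": 2.37, "drug_m_wer": 4.75, "medical_recall": 97.95},
    "MLX full precision": {"wer": 8.59, "m_wer": 2.65, "drug_m_wer": 5.20, "medical_recall": 97.70},
    "MLX q8":             {"wer": 8.61, "m_wer": 2.75, "drug_m_wer": 5.20, "medical_recall": 97.63},
}


def metric_deltas(a, b):
    """Percentage-point difference (a minus b) for every metric, rounded to 2 decimals."""
    return {m: round(ARTIFACTS[a][m] - ARTIFACTS[b][m], 2) for m in ARTIFACTS[a]}


deltas = metric_deltas("MLX full precision", "MLX q8")
for metric, delta in deltas.items():
    print(f"{metric}: {delta:+.2f} pp")
```

For the error-rate metrics a negative delta favors the first artifact, while for Medical Recall a positive delta does. Every gap between the two MLX exports is 0.10 percentage points or less.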
Compatibility
This is not a drop-in parakeet-mlx checkpoint. Omi Med STT v1 includes a
medical adapter, and the supported Mac path is the omi-med-stt CLI.
Links
- Canonical model: omi-health/omi-med-stt-v1
- Mac q8 default: omi-health/omi-med-stt-v1-mlx-q8
- CPU GGUF export: omi-health/omi-med-stt-v1-gguf
- Runtime CLI: Omi-Health/omi-med-stt-runtime
- Broader evaluation and product context: omi.health/research/omi-med-stt
Safety
Omi Med STT v1 is speech-to-text only. It is not a diagnostic, triage, prescribing, or clinical decision model, and it is not clinically validated. Transcripts must be reviewed before any clinical use.