Omnilingual ASR — CTC 7B (MLX 8-bit)

MLX-compatible 8-bit quantization of Meta's Omnilingual ASR CTC-7B model for on-device inference on Apple Silicon (M3 Pro or M4 Pro with 16+ GB of unified memory recommended). Compared with CTC-1B 4-bit, it trades ~1 GB of extra disk for measurably better accuracy on low-resource languages, per Meta's published FLEURS results.

Omnilingual ASR is a wav2vec 2.0-style encoder-only model with a linear CTC head, trained by Meta for speech recognition across 1,600+ languages. The CTC variant is language-agnostic at inference time.
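
To make the CTC head concrete: it produces one score vector over the vocabulary per audio frame, and the simplest way to turn those scores into a token sequence is greedy decoding (pick the best token per frame, merge consecutive repeats, drop blanks). The sketch below is a minimal, library-free illustration. The function name `ctc_greedy_decode` and the default `blank_id=0` are assumptions made for the example; they are not taken from this model's `config.json` or from speech-swift.

```python
def ctc_greedy_decode(frame_scores, blank_id=0):
    """Collapse per-frame scores into token IDs using greedy CTC decoding.

    frame_scores: list of per-frame score lists, one score per vocabulary entry.
    blank_id: index of the CTC blank token (assumed to be 0 in this sketch).
    """
    decoded = []
    previous = None
    for scores in frame_scores:
        # Best-scoring vocabulary index for this frame.
        best = max(range(len(scores)), key=scores.__getitem__)
        # Emit only when the token changes and is not the blank.
        if best != previous and best != blank_id:
            decoded.append(best)
        previous = best
    return decoded


# Toy example with a 3-entry vocabulary (index 0 is the assumed blank).
frames = [
    [0.9, 0.1, 0.0],  # blank
    [0.1, 0.8, 0.1],  # token 1
    [0.2, 0.7, 0.1],  # token 1 repeated, merged with the previous frame
    [0.8, 0.1, 0.1],  # blank separates the two occurrences of token 1
    [0.1, 0.6, 0.3],  # token 1 again, emitted as a new token
    [0.1, 0.2, 0.7],  # token 2
]
print(ctc_greedy_decode(frames))  # [1, 1, 2]
```

The resulting IDs would then be mapped back to text with the SentencePiece tokenizer shipped in `tokenizer.model`.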

Model


Parameters	~7 B
Format	MLX safetensors (quantized linear layers + fp16 features)
Quantization	8-bit per-group min-max, group size 64
Sample rate	16 kHz (raw waveform input)
Frame rate	50 fps
Max duration	40 s
Languages	1,600+
Vocabulary	10,288 SentencePiece tokens
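
The sample rate, frame rate, and maximum duration in the table imply some simple arithmetic that is useful when preparing input. The sketch below works it through and shows one naive way to split a long recording into pieces that fit the 40 s limit. The helper names are hypothetical, the frame count is approximate (the exact value depends on the convolutional frontend's padding), and fixed-boundary splitting is only an assumption about how long audio might be handled.

```python
SAMPLE_RATE_HZ = 16_000   # from the table: 16 kHz raw waveform
FRAME_RATE_HZ = 50        # from the table: 50 output frames per second
MAX_DURATION_S = 40       # from the table: maximum input duration

# 16,000 samples/s divided by 50 frames/s gives 320 samples per output frame.
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ // FRAME_RATE_HZ
# 40 s of audio at 16 kHz is 640,000 samples.
MAX_SAMPLES = SAMPLE_RATE_HZ * MAX_DURATION_S


def approx_num_frames(num_samples):
    """Approximate number of encoder output frames for a waveform."""
    return num_samples // SAMPLES_PER_FRAME


def split_into_chunks(num_samples, max_samples=MAX_SAMPLES):
    """Return (start, end) sample ranges, each at most max_samples long."""
    return [
        (start, min(start + max_samples, num_samples))
        for start in range(0, num_samples, max_samples)
    ]


print(SAMPLES_PER_FRAME)               # 320
print(approx_num_frames(MAX_SAMPLES))  # 2000
print(split_into_chunks(1_000_000))    # [(0, 640000), (640000, 1000000)]
```

Cutting at fixed sample boundaries can split a word in half, so splitting at pauses is usually preferable when a pause detector is available.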

Full architecture details (num_layers / model_dim / ffn_dim) are in config.json.
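
A quick way to look at those values is to read `config.json` directly. The sketch below is a minimal example that assumes the three fields named above sit at the top level of the file; if they are nested under another key in the actual config, the lookup would need adjusting. The helper name `summarize_config` is hypothetical.

```python
import json
from pathlib import Path


def summarize_config(path="config.json",
                     keys=("num_layers", "model_dim", "ffn_dim")):
    """Return the requested fields from a JSON config, or None where absent."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    # Assumes the fields are top-level keys; missing ones map to None.
    return {key: config.get(key) for key in keys}


if __name__ == "__main__":
    if Path("config.json").exists():
        for name, value in summarize_config().items():
            print(f"{name}: {value}")
```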

Files

File	Description
`model.safetensors`	8-bit quantized transformer weights + fp16 conv frontend
`tokenizer.model`	SentencePiece tokenizer
`config.json`	Architecture + quantization metadata
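
For readers unfamiliar with the "8-bit per-group min-max, group size 64" scheme listed in the Model table, the sketch below shows the general idea in plain Python: each group of 64 weights stores 8-bit codes plus a scale and an offset derived from the group's minimum and maximum. It is illustrative only. It does not reproduce MLX's packed storage layout, and the function names are hypothetical.

```python
import math

GROUP_SIZE = 64  # weights per quantization group
BITS = 8         # bits per quantized code


def quantize_group(values, bits=BITS):
    """Quantize one group to integer codes plus (scale, offset)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1                      # 255 for 8 bits
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [
        min(levels, max(0, round((v - lo) / scale)))  # clamp to [0, levels]
        for v in values
    ]
    return codes, scale, lo


def dequantize_group(codes, scale, offset):
    """Reconstruct approximate weights from codes, scale, and offset."""
    return [code * scale + offset for code in codes]


def quantize_row(row, group_size=GROUP_SIZE, bits=BITS):
    """Split a weight row into groups and quantize each independently."""
    if len(row) % group_size != 0:
        raise ValueError("row length must be a multiple of group_size")
    return [
        quantize_group(row[i:i + group_size], bits)
        for i in range(0, len(row), group_size)
    ]


# Round-trip a synthetic 128-element row (two groups of 64).
row = [math.sin(i) for i in range(128)]
groups = quantize_row(row)
restored = [w for codes, scale, offset in groups
            for w in dequantize_group(codes, scale, offset)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
print(len(groups), max_error)
```

Because every group has its own scale and offset, the rounding error for a weight is bounded by half of that group's scale, which is why smaller groups generally preserve accuracy better at the cost of more metadata.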

Usage

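import mlx.core as mx  # MLX must be installed for framework="mlx"
from safetensors import safe_open

# Read every tensor in the checkpoint into a dict keyed by parameter name.
weights = {}
with safe_open("model.safetensors", framework="mlx") as f:
    for k in f.keys():
        weights[k] = f.get_tensor(k)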
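
Once the tensors are loaded, a quick sanity check is to tally them by dtype, since the checkpoint is described as mixing quantized linear layers with fp16 features. The helper below is a small sketch that works on any mapping whose values expose a `dtype` attribute, such as the `weights` dict built by the loading code. The name `tally_dtypes` is hypothetical, and the exact dtype strings printed depend on the MLX version.

```python
from collections import Counter


def tally_dtypes(tensors):
    """Count tensors per dtype in a name-to-tensor mapping."""
    return Counter(str(tensor.dtype) for tensor in tensors.values())


def print_dtype_summary(tensors):
    """Print one line per dtype, most common first."""
    for dtype, count in tally_dtypes(tensors).most_common():
        print(f"{dtype}: {count} tensors")
```

Calling `print_dtype_summary(weights)` after loading should show a mix of packed integer tensors for the quantized layers and 16-bit floats for the rest, if the file matches the description above.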

Swift inference is provided by speech-swift.

Source

Upstream model: facebook/omniASR-CTC-7B
Paper: Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Meta blog: Omnilingual ASR announcement

License

Apache 2.0 (inherited from upstream).

Guide: soniqo.audio/guides/omnilingual
Docs: soniqo.audio
GitHub: soniqo/speech-swift

arXiv: 2511.09690 (published Nov 12, 2025)
Model: aufklarer/Omnilingual-ASR-CTC-7B-MLX-8bit