Omnilingual ASR — CTC 1B (MLX 8-bit)

MLX-compatible 8-bit quantization of Meta's Omnilingual ASR CTC-1B model for on-device inference on Apple Silicon (M1/M2/M3/M4). Prefer this variant when you want the smallest WER regression relative to fp32 and can afford roughly 460 MB more than the 4-bit build.

Omnilingual ASR is a wav2vec 2.0-style encoder-only model with a linear CTC head, trained by Meta for speech recognition across 1,600+ languages. The CTC variant is language-agnostic at inference time.
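Because the model ends in a linear CTC head, its per-frame token predictions can be turned into a token sequence with the standard greedy CTC rule: collapse consecutive repeats, then drop blanks. The sketch below illustrates that rule on toy data only. The blank index `0` is an assumption, since this card does not state which ID the tokenizer reserves for the blank symbol, and `ctc_greedy_decode` is a hypothetical helper, not part of any released API.

```python
from itertools import groupby


def ctc_greedy_decode(frame_ids, blank_id=0):
    """Greedy CTC decoding: merge consecutive repeats, then remove blanks.

    frame_ids: per-frame argmax token IDs from the CTC head.
    blank_id:  ID of the CTC blank symbol (assumed to be 0 here).
    """
    collapsed = [token for token, _ in groupby(frame_ids)]
    return [token for token in collapsed if token != blank_id]


# Toy example: blanks (0) separate the two 7s, so both survive.
frames = [0, 7, 7, 0, 0, 7, 3, 3, 0]
print(ctc_greedy_decode(frames))  # [7, 7, 3]
```

The resulting IDs would then be mapped back to text with the SentencePiece tokenizer shipped as `tokenizer.model`.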

Model


Property	Value
Parameters	1.01 B
Format	MLX safetensors (quantized linear layers + fp16 features)
Quantization	8-bit per-group min-max, group size 64
Encoder layers	48
Encoder dim	1280
Attention heads	20
FFN dim	5120
Sample rate	16 kHz (raw waveform input)
Frame rate	50 fps
Max duration	40 s
Languages	1,600+
Vocabulary	10,288 SentencePiece tokens
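To make the Quantization row concrete, here is a minimal sketch of 8-bit per-group min-max quantization with group size 64, written in plain Python. It shows the general technique only and is not the conversion script used to produce this checkpoint; the function names and the random test weights are hypothetical.

```python
import random

GROUP_SIZE = 64
LEVELS = 2 ** 8 - 1  # 255 steps between the group's min and max


def quantize_group(weights):
    """Map one group of floats to 8-bit codes plus a scale and an offset."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / LEVELS or 1.0  # fall back to 1.0 for constant groups
    codes = [min(LEVELS, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Reconstruct approximate floats from codes, scale, and offset."""
    return [c * scale + lo for c in codes]


random.seed(0)
row = [random.gauss(0.0, 0.02) for _ in range(GROUP_SIZE)]
codes, scale, lo = quantize_group(row)
restored = dequantize_group(codes, scale, lo)

# Rounding to the nearest level bounds the error by half a step.
max_err = max(abs(a - b) for a, b in zip(row, restored))
print(f"step={scale:.6f} max_err={max_err:.6f}")
```

If each group also stores one fp16 scale and one fp16 offset (an assumption about the storage layout), the overhead is 32 bits per 64 weights, or about 8.5 bits per quantized weight.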

Files

File	Size	Description
`model.safetensors`	1006 MB	8-bit quantized transformer weights + fp16 conv frontend
`tokenizer.model`	1.2 MB	SentencePiece tokenizer
`config.json`	<1 KB	Architecture + quantization metadata
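One way to check what a safetensors file contains without loading any weights is to read its header: the format begins with an 8-byte little-endian length followed by a JSON document describing each tensor. The sketch below uses only the standard library. The helper names are hypothetical, and the demo file is synthetic, written only so the example runs without the real checkpoint.

```python
import json
import os
import struct
import tempfile
from collections import Counter


def read_safetensors_header(path):
    """Return the tensor table from a safetensors file, without its metadata."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))  # u64, little-endian
        header = json.loads(f.read(header_len))
    header.pop("__metadata__", None)  # optional free-form metadata entry
    return header


def dtype_counts(header):
    """Count tensors per dtype string, e.g. 'U32' or 'F16'."""
    return Counter(entry["dtype"] for entry in header.values())


def write_demo_file(path):
    """Write a tiny synthetic safetensors file (made-up tensor names)."""
    header = {
        "__metadata__": {"note": "synthetic demo"},
        "demo.weight": {"dtype": "U32", "shape": [2, 2], "data_offsets": [0, 16]},
        "demo.scales": {"dtype": "F16", "shape": [2], "data_offsets": [16, 20]},
    }
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)))
        f.write(blob)
        f.write(bytes(20))  # zero-filled tensor data


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.safetensors")
    write_demo_file(demo_path)
    demo_counts = dtype_counts(read_safetensors_header(demo_path))

print(dict(demo_counts))
```

Pointing `read_safetensors_header` at a downloaded `model.safetensors` would list the real tensor names, shapes, and dtypes.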

Architecture

Wav2Vec2FeatureExtractor (7-layer CNN, 320× downsample) → Linear 512→1280 → conv position encoder → 48× pre-norm Transformer encoder (dim 1280, 20 heads, ffn 5120) → LayerNorm → Linear CTC head (→ 10,288 tokens).
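The 320× downsampling ties the 16 kHz input rate to the 50 fps frame rate and sets how many encoder frames a clip produces. The arithmetic can be sketched as follows; `approx_frames` is a hypothetical helper, and it is only approximate because the exact frame count depends on the CNN's kernel sizes and padding, which this card does not list.

```python
SAMPLE_RATE_HZ = 16_000   # raw waveform input
DOWNSAMPLE = 320          # total stride of the 7-layer CNN frontend
MAX_DURATION_S = 40       # maximum clip length

frames_per_second = SAMPLE_RATE_HZ // DOWNSAMPLE   # 50
frame_ms = 1000 * DOWNSAMPLE / SAMPLE_RATE_HZ      # 20.0 ms of audio per frame
max_samples = SAMPLE_RATE_HZ * MAX_DURATION_S      # 640000 samples


def approx_frames(num_samples):
    """Approximate encoder frame count, ignoring convolution edge effects."""
    return num_samples // DOWNSAMPLE


print(frames_per_second, frame_ms, max_samples, approx_frames(max_samples))
# 50 20.0 640000 2000
```

A full 40 s clip therefore yields on the order of 2,000 frames for the Transformer encoder and the CTC head.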

Performance

See the 4-bit variant for additional architecture notes and the 300M reference card for FLEURS WER on en/fr/de/ar/hi. The 1B model has roughly 3× the encoder capacity of the 300M model and delivers correspondingly lower WER on low-resource languages.
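As a rough sanity check on the model's scale, the large weight matrices can be tallied from the dimensions listed in the Model section. The sketch assumes a standard Transformer layer (four dim×dim attention projections and two dim×ffn feed-forward matrices) and ignores biases, layer norms, the CNN frontend, and the positional encoder, so it undercounts the real total.

```python
LAYERS = 48
DIM = 1280
FFN_DIM = 5120
VOCAB = 10_288

attention_weights = 4 * DIM * DIM        # Q, K, V, and output projections
ffn_weights = 2 * DIM * FFN_DIM          # up and down projections
per_layer = attention_weights + ffn_weights

encoder_weights = LAYERS * per_layer     # 943,718,400
ctc_head_weights = DIM * VOCAB           # 13,168,640
total = encoder_weights + ctc_head_weights

print(f"{total:,} weights, about {total / 1e9:.2f} B")
# 956,887,040 weights, about 0.96 B
```

Under these assumptions the encoder layers and CTC head account for about 0.96 B of the 1.01 B parameters reported for the model.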

Source

Upstream model: facebook/omniASR-CTC-1B
Paper: Omnilingual ASR: Open-Source Multilingual Speech Recognition for 1600+ Languages
Meta blog: Omnilingual ASR announcement

License

Apache 2.0 (inherited from upstream).

Guide: soniqo.audio/guides/omnilingual
Docs: soniqo.audio
GitHub: soniqo/speech-swift

Paper (arXiv): 2511.09690, published Nov 12, 2025

aufklarer
/

Omnilingual-ASR-CTC-1B-MLX-8bit