This is a compact multilingual self-supervised speech encoder based on facebook/hubert-large-ll60k. We performed continued pretraining through multilingual adaptive finetuning (MAFT) on over 10,000 hours of African language data aggregated from various sources.

AfriHuBERT-large covers 1,230 languages in total, including 1,226 indigenous African languages.
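Because the model is built on facebook/hubert-large-ll60k, it presumably keeps HuBERT's convolutional front end, which turns raw audio into one feature vector roughly every 20 ms. The sketch below estimates how many frames the encoder would emit for a given clip length. The kernel and stride values are the standard HuBERT / wav2vec 2.0 defaults and the 16 kHz sampling rate is the usual input rate for that base model; neither is stated in this model card, so treat them as assumptions, and note that `num_output_frames` is an illustrative helper, not part of any library.

```python
# Assumed front end: the standard 7-layer HuBERT / wav2vec 2.0 convolutional
# feature encoder. These values are NOT given in this model card.
CONV_KERNELS = (10, 3, 3, 3, 3, 2, 2)
CONV_STRIDES = (5, 2, 2, 2, 2, 2, 2)
SAMPLE_RATE_HZ = 16_000  # assumed input sampling rate


def num_output_frames(num_samples: int) -> int:
    """Estimate the number of encoder frames for `num_samples` raw samples.

    Applies the unpadded 1-D convolution length formula layer by layer:
    out = (in - kernel) // stride + 1.
    """
    length = num_samples
    for kernel, stride in zip(CONV_KERNELS, CONV_STRIDES):
        if length < kernel:
            # Clip is shorter than the receptive field: no frame is produced.
            return 0
        length = (length - kernel) // stride + 1
    return length


if __name__ == "__main__":
    one_second = SAMPLE_RATE_HZ
    print(num_output_frames(one_second))       # frames for 1 s of audio
    print(num_output_frames(10 * one_second))  # frames for 10 s of audio
```

Under these assumptions the strides multiply to 320 samples, which is the 20 ms hop at 16 kHz, so downstream labels (for example frame-level language or phone tags) would need to be aligned to that rate.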
If you use this model in your research, please cite the following works:
@phdthesis{alabi2026phd,
author = {Alabi, Jesujoba Oluwadara},
title = {Advancing African NLP: Adaptation, Analysis, and Evaluation of Large Language Models},
school = {Saarland University},
year = {2026},
note = {Unpublished doctoral dissertation}
}
@misc{alabi2024afrihubertselfsupervisedspeechrepresentation,
title={AfriHuBERT: A self-supervised speech representation model for African languages},
author={Jesujoba O. Alabi and Xuechen Liu and Dietrich Klakow and Junichi Yamagishi},
year={2024},
eprint={2409.20201},
archivePrefix={arXiv},
primaryClass={cs.CL},
url={https://arxiv.org/abs/2409.20201},
}
Base model: facebook/hubert-large-ll60k