This model was created for the On Top of Pasketti: Children’s Speech Recognition Challenge - Word Track competition. It was trained on a large-scale dataset specifically designed for children's speech recognition.

The model is based on nvidia/parakeet-tdt-0.6b-v2.

  • Local validation WER: 0.1055
  • Private Leaderboard WER: (unknown)
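
For readers unfamiliar with the metric: word error rate (WER) is conventionally (substitutions + deletions + insertions) divided by the number of reference words, so a WER of 0.1055 corresponds to roughly 10.5 word errors per 100 reference words. The sketch below computes that standard definition with a word-level edit distance. It is an illustration only: the function name `word_error_rate` is hypothetical, and any text normalization applied by the competition's official scorer is not described here and is not reproduced.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level edit distance divided by reference length.

    Assumption: words are split on whitespace with no further normalization
    (no lowercasing, no punctuation stripping). The official scorer may differ.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far (before the current row) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat on mat")` counts one deleted word out of six reference words.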

Usage:

# Shell: install NeMo with ASR support and download a demo clip
pip install -U "nemo_toolkit[asr]"
wget https://github.com/drivendataorg/childrens-speech-recognition-runtime/raw/refs/heads/main/data-demo/phonetic/audio/U_1c8757065e355c35.flac

# Python: load the model and transcribe the clip
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained(model_name="ZFTurbo/parakeet-tdt-0.6b-v2-Children-Words")
output = asr_model.transcribe(['U_1c8757065e355c35.flac'])
print(output[0].text)
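
To transcribe several files at once, the same `transcribe` call can be wrapped in a small helper. This is a minimal sketch, not part of the original solution: the helper names `transcribe_files` and `load_model` are hypothetical, and it assumes `transcribe` returns one result per input path, as in the snippet above. Some NeMo versions return plain strings rather than objects with a `.text` field, so the helper accepts both.

```python
from typing import Dict, List


def transcribe_files(asr_model, audio_paths: List[str], batch_size: int = 4) -> Dict[str, str]:
    """Transcribe several audio files and map each path to its text.

    Assumption: asr_model.transcribe returns one result per input path,
    in the same order as audio_paths.
    """
    outputs = asr_model.transcribe(audio_paths, batch_size=batch_size)
    results: Dict[str, str] = {}
    for path, item in zip(audio_paths, outputs):
        # Use .text when the result is a hypothesis object,
        # otherwise fall back to treating it as a plain string.
        results[path] = item.text if hasattr(item, "text") else str(item)
    return results


def load_model():
    """Load the fine-tuned checkpoint (downloads weights on first use)."""
    # Imported lazily so the helper above can be used or tested without NeMo.
    import nemo.collections.asr as nemo_asr

    return nemo_asr.models.ASRModel.from_pretrained(
        model_name="ZFTurbo/parakeet-tdt-0.6b-v2-Children-Words"
    )
```

A call such as `transcribe_files(load_model(), ["U_1c8757065e355c35.flac"])` would then return a dictionary keyed by file path. A larger `batch_size` generally trades memory for throughput.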

More usage examples: https://github.com/ZFTurbo/Children-Speech-Recognition-Challenge-Solution
