How to specify the output language?
#26
by dragonhunterau - opened
It's great that Parakeet v3 (nvidia/parakeet-tdt-0.6b-v3) supports multiple languages now, but it randomly generates unexpected characters from other languages when the audio is pure English. Is there any parameter or token hint we can use to force it to generate tokens in a particular language?
Yeah, it seems there is a good bit of "language cross-contamination". On paper it may seem like a good idea to ignore language completely, but in practice it does not work very well. In use cases like dictation in Danish, for example, I sometimes get a few Swedish words, then a few English words, and then it switches back to Danish.
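This does not answer the question of forcing an output language, but as a rough way to spot the first symptom described above (characters from another script showing up in an English transcript), a small post-processing check might look like the sketch below. It is an assumption-laden illustration, not a feature of the model or of NeMo: the function names are hypothetical, it only uses Python's standard `unicodedata` module, and it treats "expected" as "Latin script".

```python
import unicodedata


def is_latin_letter(ch: str) -> bool:
    """Return True if the character's Unicode name marks it as a Latin letter."""
    try:
        return unicodedata.name(ch).startswith("LATIN")
    except ValueError:
        # Character has no Unicode name (e.g. unassigned code point).
        return False


def find_foreign_script_words(transcript: str) -> list[str]:
    """Hypothetical helper: list words containing letters outside the Latin script.

    Digits and punctuation are ignored; only alphabetic characters are checked.
    """
    flagged = []
    for word in transcript.split():
        if any(ch.isalpha() and not is_latin_letter(ch) for ch in word):
            flagged.append(word)
    return flagged


if __name__ == "__main__":
    # A transcript of English audio with one stray Cyrillic word.
    print(find_foreign_script_words("please open the привет file"))
```

Note the limitation: a script check like this cannot catch the Danish/Swedish/English mixing mentioned above, since all three use the Latin script. Flagging that kind of contamination would need a language-identification step on the text, which is outside this sketch.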