Will there be a 48kHz model?
#3
by utkarsh22990 - opened
Hi Team,
I just wanted to know whether there will be a 48 kHz model, or whether there is any way to train or fine-tune on 48 kHz data. That would be helpful. Thank you.
We're not planning on training a 48 kHz model. If you have 48 kHz audio data that you need to train on, you can downsample it to 24 kHz for SNAC; the quality loss is minimal for TTS.
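For anyone following along, here is a minimal sketch of that downsampling step, assuming the audio is already loaded as a mono float NumPy array sampled at 48 kHz. The function name `downsample_by_2` and the filter length are illustrative choices, not part of this repo; in practice an established resampler (for example `scipy.signal.resample_poly`) would do the same job.

```python
import numpy as np


def downsample_by_2(audio: np.ndarray, num_taps: int = 101) -> np.ndarray:
    """Low-pass filter, then keep every second sample (48 kHz -> 24 kHz).

    Hypothetical helper for illustration only; assumes mono float audio.
    """
    # Windowed-sinc low-pass filter with its cutoff at a quarter of the
    # input sample rate, i.e. the Nyquist frequency of the 24 kHz output.
    n = np.arange(num_taps) - (num_taps - 1) / 2
    taps = 0.5 * np.sinc(0.5 * n) * np.hamming(num_taps)
    taps /= taps.sum()  # normalize to unity gain at DC

    # mode="same" keeps the output aligned with the input (no added delay
    # for an odd-length symmetric filter).
    filtered = np.convolve(audio, taps, mode="same")

    # Decimate: drop every other sample.
    return filtered[::2]


# Example: one second of a 440 Hz tone at 48 kHz.
src_rate, dst_rate = 48_000, 24_000
t = np.arange(src_rate) / src_rate
tone = np.sin(2 * np.pi * 440.0 * t)
resampled = downsample_by_2(tone)
print(len(tone), "->", len(resampled))
```

The low-pass filter comes first because simply dropping samples would fold content above 12 kHz back into the audible band as aliasing.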
Hi Aditya, thanks for the response and the wonderful repo. However, I do need a 48 kHz model to get better quality for my use case. I have the compute. Could you suggest whether I can train the original Orpheus TTS to output 48 kHz, and what major changes I would need to make? I'd be very grateful. Thank you again.