| --- |
| license: mit |
| base_model: vibevoice/VibeVoice-7B |
| tags: |
| - tts |
| - text-to-speech |
| - speech-synthesis |
| - norwegian |
| - bokmal |
| language: |
| - "no" |
| - nb |
| --- |
| |
| # Prat-9B (preview) |
|
|
| A Norwegian (Bokmål) text-to-speech model fine-tuned for the Østnorsk/Oslo dialect. |
| This model is currently in preview, so you can expect occasional audio artefacts. |
| In our informal, qualitative testing, it generally outperforms VibeVoice-7B. |
|
|
| ## Usage |
|
|
| ```python |
| from transformers import AutoProcessor, AutoModel |
| import torch |
| |
| # Load the processor and the model weights in bfloat16 |
| processor = AutoProcessor.from_pretrained("heiertech/Prat-9B") |
| model = AutoModel.from_pretrained("heiertech/Prat-9B", torch_dtype=torch.bfloat16) |
| |
| # Prepare the input text and generate speech |
| text = "Hei, dette er en test av den norske stemmen." |
| inputs = processor(text=text, return_tensors="pt") |
| outputs = model.generate(**inputs) |
| ``` |
|
|
| ## Base Model |
|
|
| This model is based on [VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B). |
| Note that despite the name, VibeVoice-7B is actually a 9B-parameter model. |
| The "7B" refers only to the size of the LLM backbone, which is based on Qwen2.5 7B. |
|
|
| ## Acknowledgments |
|
|
| - Base model: [vibevoice/VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B) |
| - Training data: Mozilla Common Voice Norwegian |
|
|