	---
	license: mit
	base_model: vibevoice/VibeVoice-7B
	tags:
	- tts
	- text-to-speech
	- speech-synthesis
	- norwegian
	- bokmal
	language:
	- "no"
	- nb
	---

	# Prat-9B (preview)

	A Norwegian (Bokmål) text-to-speech model fine-tuned for the Østnorsk/Oslo dialect.
	This model is currently in preview, so you can expect occasional artefacts in the generated audio.
	In our informal, qualitative testing, however, it generally outperforms VibeVoice-7B.

	## Usage

	```python
	from transformers import AutoProcessor, AutoModel
	import torch

	processor = AutoProcessor.from_pretrained("heiertech/Prat-9B")
	model = AutoModel.from_pretrained("heiertech/Prat-9B", torch_dtype=torch.bfloat16)

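	# Generate speech from a Norwegian test sentence
	# ("Hi, this is a test of the Norwegian voice.")
	text = "Hei, dette er en test av den norske stemmen."
	# Convert the text into model inputs as PyTorch tensors
	inputs = processor(text=text, return_tensors="pt")
	outputs = model.generate(**inputs)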
	```
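
The snippet above stops at `model.generate`, and this card does not document what `outputs` contains. As a minimal sketch, assuming you have already extracted a mono waveform of float samples in the range [-1.0, 1.0] from `outputs`, the hypothetical helper `save_wav` below writes it to a 16-bit PCM WAV file using only the Python standard library. The default `sample_rate` of 24000 Hz is an assumption, so check it against the model configuration before relying on it.

```python
import struct
import wave


def save_wav(samples, path, sample_rate=24000):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM WAV file.

    `sample_rate` defaults to 24000 Hz, which is an assumption and not
    something this model card states.
    """
    pcm = bytearray()
    for sample in samples:
        # Clip out-of-range values so they cannot overflow the int16 range
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm += struct.pack("<h", int(round(clipped * 32767)))

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)   # mono
        wav_file.setsampwidth(2)   # 2 bytes per sample = 16-bit PCM
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(bytes(pcm))
```

If the waveform is a PyTorch tensor (called `waveform` here purely for illustration), something like `save_wav(waveform.float().cpu().flatten().tolist(), "test.wav")` should work. The `.float()` call matters because the model is loaded in `bfloat16`.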

	## Base Model

	This model is based on [VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B).
	Note that despite the name, VibeVoice-7B is actually a 9B-parameter model.
	The "7B" refers only to the size of the LLM backbone, which is based on Qwen2.5-7B.
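
One way to check the 9B versus 7B distinction yourself is to count parameters per top-level submodule of the loaded model. The helper below, `count_parameters_by_module`, is a hypothetical name and a sketch only. It assumes nothing about the checkpoint beyond the standard `named_parameters()` interface of `torch.nn.Module`, and it does not assume any particular submodule names.

```python
from collections import defaultdict


def count_parameters_by_module(model):
    """Return (total, per_module) parameter counts for a PyTorch-style model.

    `model` only needs a `named_parameters()` method that yields
    (name, tensor) pairs where each tensor has `numel()`, as
    torch.nn.Module does.
    """
    per_module = defaultdict(int)
    for name, param in model.named_parameters():
        # Group by the first component of the dotted parameter name
        top_level = name.split(".", 1)[0]
        per_module[top_level] += param.numel()
    total = sum(per_module.values())
    return total, dict(per_module)
```

With `model` loaded as in the Usage section, `total, parts = count_parameters_by_module(model)` followed by `print(f"{total / 1e9:.2f}B", parts)` shows the overall size next to the size of each top-level component.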

	## Acknowledgments

	- Base model: [vibevoice/VibeVoice-7B](https://huggingface.co/vibevoice/VibeVoice-7B)
	- Training data: Mozilla Common Voice Norwegian