---
license: mit
language:
- ar
base_model:
- ResembleAI/chatterbox
pipeline_tag: text-to-speech
tags:
- Saudi
- Arabic
- Saudi-Dialect
- Chatterbox
- TTS
- voice-cloning
- multilingual-tts
library_name: chatterbox
---

# 🇸🇦 NAMAA-Saudi-TTS
**NAMAA-Saudi-TTS** is a Saudi Arabic Text-to-Speech (TTS) model built on top of the **Chatterbox Multilingual TTS** architecture.
The model is configured and refined to generate **natural Saudi dialect speech**, targeting everyday conversational usage rather than Modern Standard Arabic (MSA).
This model is developed and released by **NAMAA Community (Network for Advancing Modern Arabic AI)** as part of its efforts to advance high-quality Arabic speech and language technologies.
---
## 🔊 Live Demo (Hugging Face Space)
👉 **Try the model here:**
https://huggingface.co/spaces/omarelshehy/NAMAA-Saudi-Voice
---
## ✨ Model Capabilities
The model supports:
- **Saudi Arabic text input** (`language_id = "ar"`)
- Natural conversational prosody
- Saudi dialect phrasing and rhythm
- Optional **reference audio prompting** for:
  - Speaker similarity
  - Style and tone transfer
- GPU-accelerated inference
This repository contains all required **model checkpoints and assets** for local or hosted inference.
---
## 🗣️ Example Text (Saudi Dialect)
```text
آبي أروح البقالة أشتري كم غرض وأرجع بسرعة.
```
## ⚠️ Limitations
Please be aware of the following current limitations:
- Lack of tashkeel may affect pronunciation accuracy.
- Numeric normalization will be improved in future releases.
- This is a known limitation of the current flow-based generation.
These limitations are actively being addressed in upcoming versions.
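Since missing tashkeel and unnormalized digits are listed above as weak points, it may help to flag such inputs before synthesis. The following is a minimal sketch, not part of the model or of Chatterbox: the helper name `check_tts_text` and its return shape are hypothetical, and it only inspects the string using standard Unicode ranges.

```python
import re

# Arabic diacritics (tashkeel): fathatan U+064B through sukun U+0652.
TASHKEEL = re.compile(r"[\u064B-\u0652]")
# ASCII digits, Arabic-Indic digits (U+0660-U+0669),
# and Extended Arabic-Indic digits (U+06F0-U+06F9).
DIGITS = re.compile(r"[0-9\u0660-\u0669\u06F0-\u06F9]")


def check_tts_text(text: str) -> dict:
    """Hypothetical helper: report input features the limitations mention."""
    return {
        "has_tashkeel": bool(TASHKEEL.search(text)),
        "has_digits": bool(DIGITS.search(text)),
    }


report = check_tts_text("عندي 3 مواعيد اليوم")
if report["has_digits"]:
    print("Digits found; consider writing numbers out as words.")
if not report["has_tashkeel"]:
    print("No tashkeel found; pronunciation may be less accurate.")
```

The check is deliberately passive: it reports what it finds and leaves the decision about rewriting the text to the caller.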
## 🧪 Example Usage (Inference)
```python
import torchaudio as ta
from huggingface_hub import snapshot_download
from safetensors.torch import load_file as load_safetensors
from chatterbox import mtl_tts

device = "cuda"  # or "cpu" / "mps"

# Download the NAMAA checkpoint files from the Hub
ckpt_dir = snapshot_download(
    repo_id="NAMAA-Space/NAMAA-Saudi-TTS",
    repo_type="model",
    revision="main",
)

# Load the base multilingual model, then load the T3 weights from this repository
model = mtl_tts.ChatterboxMultilingualTTS.from_pretrained(device=device)
t3_state = load_safetensors(
    f"{ckpt_dir}/t3_mtl23ls_v2.safetensors",
    device=device,
)
model.t3.load_state_dict(t3_state)
model.t3.to(device).eval()

# Saudi Arabic text
text = "أنا الحين بروح الشغل وإذا رجعت بمرّ البقالة"
wav = model.generate(text, language_id="ar")
ta.save("namaa_saudi.wav", wav, model.sr)
```
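The example above hard-codes `device = "cuda"` and mentions `"cpu"` and `"mps"` as alternatives. A minimal sketch of choosing among them at runtime follows; the helper name `pick_device` is hypothetical, and it relies only on PyTorch's own availability checks.

```python
def pick_device() -> str:
    """Hypothetical helper: return "cuda", "mps", or "cpu" based on what PyTorch reports."""
    try:
        import torch
    except ImportError:
        return "cpu"  # torch is not installed, so there is nothing else to choose
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent in older PyTorch builds, hence the getattr.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


device = pick_device()
print(device)
```

The resulting string can be passed wherever the example uses `device`.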
### 🔹 Inference with Reference Audio (Voice / Style Transfer)
```python
# Reuses `model` and `ta` from the example above
text = "آبي أخلص الشغل اليوم وأرتاح بكرة"
wav = model.generate(
    text,
    language_id="ar",
    audio_prompt_path="/content/reference_saudi.wav",  # path to your reference clip
)
ta.save("namaa_saudi_ref.wav", wav, model.sr)
```
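Before passing a file as `audio_prompt_path`, it can be useful to confirm that it is a readable WAV and to see how long it is. This card does not state what makes a good reference clip, so the sketch below only reports facts about the file. The helper name `describe_wav` is hypothetical, and Python's `wave` module reads uncompressed PCM WAV only.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Hypothetical helper: read duration, sample rate, and channels from a PCM WAV header."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "seconds": frames / rate,
            "sample_rate": rate,
            "channels": wf.getnchannels(),
        }


# Demo only: write one second of 16 kHz mono silence, then inspect it.
demo_path = os.path.join(tempfile.mkdtemp(), "demo_reference.wav")
with wave.open(demo_path, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 16-bit samples
    wf.setframerate(16000)
    wf.writeframes(b"\x00\x00" * 16000)

info = describe_wav(demo_path)
print(info)
```

In practice, `describe_wav` would be called on your own reference recording instead of the generated demo file.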
## 🧠 Base Model
This model is built on top of:
- **ResembleAI/chatterbox**
- **Chatterbox Multilingual TTS architecture**
The Saudi dialect behavior is achieved through **specialized configuration, prompting, and curated usage patterns**, rather than training focused on Modern Standard Arabic (MSA).
---
## 📜 License
This model is released under the **MIT License**, allowing both **research and commercial usage** with proper attribution.
---
## 🤝 Community & Contributions
Developed and maintained by **NAMAA Community**
*(Network for Advancing Modern Arabic NLP & AI)*
We welcome:
- Feedback and evaluations
- Dialect-specific test cases
- Contributions toward improving Arabic Text-to-Speech systems
---
## 📌 Citation
If you use this model in research or production, please cite:
```bibtex
@misc{namaa_saudi_tts,
title = {NAMAA-Saudi-TTS: Saudi Dialect Text-to-Speech},
author = {{NAMAA Community}},
year = {2026},
url = {https://huggingface.co/NAMAA-Space/NAMAA-Saudi-TTS}
}
```