Model name: fa-ir-tts-piper-en-mantatts-v1
Previous name: kiarashQ/fa_IR-mantatts
Sampling rate: 22,050 Hz
Base checkpoint: ar/ar_JO/kareem/medium/epoch=5079-step=1682020.ckpt (Piper AR, medium)
This is a Persian (fa-IR) single-speaker TTS model fine-tuned from the Arabic Piper medium checkpoint on the ManaTTS dataset.
Training script: piper_train
Hardware: 1× GPU A4000
Dataset: ManaTTS
Batch size: 16
Precision: 32-bit
Validation split: 1%
Test samples: 5
Training epochs: 20
Logging: every 2000 steps
Quality setting: medium
Checkpoint frequency: every 1 epoch
No resume checkpoint (fresh fine-tune)
Training command:
piper_train \
  --dataset-dir /workspace/piper_full/piper_dataset \
  --accelerator gpu --devices 1 \
  --batch-size 16 \
  --validation-split 0.01 \
  --num-test-examples 5 \
  --quality medium \
  --checkpoint-epochs 1 \
  --max_epochs 20 \
  --precision 32 \
  --log_every_n_steps 2000
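To make the mapping between the listed settings and the command-line flags explicit, the following is a minimal sketch that assembles the same argument list from a dictionary. The dictionary layout and the helper name `build_train_command` are illustrative assumptions, not part of Piper; the flags and values are copied from the command above.

```python
import shlex

# Flags and values copied from the training command above.
# The dict layout and helper name are illustrative, not part of Piper.
TRAIN_SETTINGS = {
    "--dataset-dir": "/workspace/piper_full/piper_dataset",
    "--accelerator": "gpu",
    "--devices": 1,
    "--batch-size": 16,
    "--validation-split": 0.01,
    "--num-test-examples": 5,
    "--quality": "medium",
    "--checkpoint-epochs": 1,
    "--max_epochs": 20,
    "--precision": 32,
    "--log_every_n_steps": 2000,
}


def build_train_command(settings, program="piper_train"):
    """Return an argv list: the program name followed by flag/value pairs."""
    argv = [program]
    for flag, value in settings.items():
        argv += [flag, str(value)]
    return argv


command = build_train_command(TRAIN_SETTINGS)
# Print a shell-quoted, copy-pasteable form of the command.
print(shlex.join(command))
```

Keeping the settings in one place makes it easier to change a single value, such as the batch size, and regenerate the command. The list can be passed to `subprocess.run(command, check=True)` to launch training.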
Inference (CLI; Piper reads the text to synthesize from standard input):
echo "سلام! حال شما چطور است؟" | piper \
  --model model.onnx \
  --config config.json \
  --output_file out.wav
Python:
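import subprocess

text = "سلام! امروز هوا چطور است؟"
# Piper reads the text from standard input; check=True raises if synthesis fails.
subprocess.run(
    ["piper", "--model", "model.onnx", "--config", "config.json",
     "--output_file", "out.wav"],
    input=text, encoding="utf-8", check=True,
)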
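After synthesis, it can be useful to confirm that the output file matches the 22,050 Hz sampling rate stated for this model. The sketch below uses only the standard-library `wave` module; the helper names `wav_info` and `check_sample_rate` are hypothetical, and the code assumes the output is an uncompressed PCM WAV file.

```python
import wave

EXPECTED_RATE_HZ = 22050  # sampling rate stated for this model


def wav_info(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def check_sample_rate(path, expected=EXPECTED_RATE_HZ):
    """Raise ValueError if the file's sample rate differs from the expected one."""
    rate, _, _ = wav_info(path)
    if rate != expected:
        raise ValueError(f"{path}: expected {expected} Hz, got {rate} Hz")
    return rate
```

For example, calling `check_sample_rate("out.wav")` after either snippet above should return the sample rate, or raise if the file was produced with a different configuration.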
License: Apache-2.0
Base model: rhasspy/piper-voices