Good Audio Generation space, model, dataset collection
-
Audio-to-Audio • Updated • 265k • 108 -
KittenML/kitten-tts-nano-0.1
Updated • 18.3k • 515 -
FunAudioLLM/ThinkSound
Video-to-Video • Updated • 54 -
ThinkSound
🔊 320 • Generate audio for a silent video using text prompts
-
Higgs Audio Demo
🎤 401 • Higgs Audio Demo
-
bosonai/higgs-audio-v2-generation-3B-base
Text-to-Speech • 6B • Updated • 127k • 682 -
Hibiki Samples
🤗 54 • Translate speech in real-time with high fidelity
-
kyutai/moshiko-pytorch-bf16
Updated • 118k • 244 -
kyutai/mimi
Feature Extraction • 96.2M • Updated • 1.18M • 307 -
maya-research/Veena
Text-to-Speech • 4B • Updated • 1.84k • 233 -
MiniMax Speech Tech Report
🎙 106 • Generate natural speech in any voice from text
-
google/magenta-realtime
Updated • 123 • 551 -
PlayDiffusion
🎨 120 • Generate modified audio from text and voice
-
Qwen2.5 Omni 7B Demo
🏆 372 • Chat with text, audio, images, and video, get spoken replies
-
Open ASR Leaderboard
🏆 1.37k • Explore and compare speech recognition model benchmarks
-
Open NotebookLM
🎙 143 • Generate a podcast to discuss the topic of your choice!
-
Voila Demo
💻 44 • Chat with a voice-clone AI
-
Voice Clone
🗣 2.65k • Clone a voice and generate speech from text
-
moonshotai/Kimi-Audio-7B-Instruct
Text-to-Speech • 10B • Updated • 54.3k • 402 -
moonshotai/Kimi-Audio-7B
Text-to-Speech • 10B • Updated • 90 • 84 -
Dia 1.6B
👯 1.78k • Generate realistic dialogue from a script, using Dia!
-
nari-labs/Dia-1.6B
Text-to-Speech • 2B • Updated • 3.6k • 2.88k -
ByteDance/MegaTTS3
Text-to-Speech • Updated • 71 • 419 -
Di♪♪Rhythm
🎶 688 • Blazingly Fast and Embarrassingly Simple Song Generation
-
Gemini Audio Video
♊ 35 • Gemini understands audio and video!
-
nvidia/diar_sortformer_4spk-v1
Automatic Speech Recognition • 0.1B • Updated • 6.03k • 141 -
ACE Step
😻 662 • A Step Towards Music Generation Foundation Model
-
ACE-Step/ACE-Step-v1-3.5B
Text-to-Audio • Updated • 734 -
stepfun-ai/Step-Audio-2-mini
Any-to-Any • 8B • Updated • 2.86k • 259 -
neuphonic/neutts-air
Text-to-Speech • 0.7B • Updated • 22.2k • 873 -
NeuTTS-Air
☁ 318 • Clone a voice and generate custom speech
-
KaniTTS
😻 114 • Generate expressive speech from your text in seconds
-
microsoft/UserLM-8b
Text Generation • 8B • Updated • 555 • 377 -
pipecat-ai/smart-turn-v3
Voice Activity Detection • Updated • 174 -
meituan-longcat/LongCat-Audio-Codec
Updated • 42 -
Qwen3 TTS Voice Design
📈 113 • Generate custom speech from text and voice description
-
Qwen TTS Clone Demo
👀 64 • Create a custom voice and synthesize speech from text
-
ResembleAI/chatterbox-turbo
Text-to-Speech • Updated • 653 -
Chatterbox Turbo Demo
⚡ 504 • Chatterbox Turbo Demo
-
zai-org/GLM-TTS
Text-to-Speech • Updated • 236 • 339 -
Qwen/Qwen3-TTS-12Hz-1.7B-CustomVoice
Text-to-Speech • 2B • Updated • 1.48M • 1.61k -
Qwen3-TTS Demo
🎙 1.96k • Generate speech from text using voice design, cloning or presets
-
Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice
Text-to-Speech • 0.9B • Updated • 714k • 157 -
FlashLabs/Chroma-4B
Any-to-Any • 6B • Updated • 44 • 382 -
FlashLabs Chroma 1.0: A Real-Time End-to-End Spoken Dialogue Model with Personalized Voice Cloning
Paper • 2601.11141 • Published • 23 -
MOSS Transcribe Diarize: Accurate Transcription with Speaker Diarization
Paper • 2601.01554 • Published • 62 -
FunAudioLLM/Fun-Audio-Chat-8B
Any-to-Any • 9B • Updated • 262 • 184 -
OpenMOSS-Team/MOSS-TTS-Nano-100M
Text-to-Speech • Updated • 111k • 217 -
KittenTTS Demo
😻 85 • Generate natural‑sounding speech from typed text