Hy-MT2: A Family of Fast, Efficient and Powerful Multilingual Translation Models in the Wild Paper • 2605.22064 • Published 3 days ago • 3
Lance: Unified Multimodal Modeling by Multi-Task Synergy Paper • 2605.18678 • Published 6 days ago • 71
DeepVQE: Real Time Deep Voice Quality Enhancement for Joint Acoustic Echo Cancellation, Noise Suppression and Dereverberation Paper • 2306.03177 • Published Jun 5, 2023 • 1
ERNIE-Image Collection The series of image generation models, including text2img and img2img. • 4 items • Updated 3 days ago • 23
One-Step Diffusion Transformer for Controllable Real-World Image Super-Resolution Paper • 2511.17138 • Published Nov 21, 2025 • 2
LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale Paper • 2504.16030 • Published Apr 22, 2025 • 37
AuraSR Collection Fastest super resolution model for AI generated images • 2 items • Updated Jul 30, 2024 • 7
VoxCPM: Tokenizer-Free TTS for Context-Aware Speech Generation and True-to-Life Voice Cloning Paper • 2509.24650 • Published Sep 29, 2025 • 11
OmniVoice: Towards Omnilingual Zero-Shot Text-to-Speech with Diffusion Language Models Paper • 2604.00688 • Published Apr 1 • 14
Speculative Decoding for 2x Faster Whisper Inference Article • sanchit-gandhi • Dec 20, 2023 • 32