- One Shot, One Talk: Whole-body Talking Avatar from a Single Image
  Paper • 2412.01106 • Published • 24
- MEMO: Memory-Guided Diffusion for Expressive Talking Video Generation
  Paper • 2412.04448 • Published • 10
- IDOL: Instant Photorealistic 3D Human Creation from a Single Image
  Paper • 2412.14963 • Published • 6
- OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models
  Paper • 2502.01061 • Published • 225
Collections including paper arxiv:2503.23307
- MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization
  Paper • 2504.00999 • Published • 97
- Any2Caption:Interpreting Any Condition to Caption for Controllable Video Generation
  Paper • 2503.24379 • Published • 76
- Exploring the Effect of Reinforcement Learning on Video Understanding: Insights from SEED-Bench-R1
  Paper • 2503.24376 • Published • 38
- A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
  Paper • 2503.21614 • Published • 43

- One-Minute Video Generation with Test-Time Training
  Paper • 2504.05298 • Published • 110
- MoCha: Towards Movie-Grade Talking Character Synthesis
  Paper • 2503.23307 • Published • 141
- Towards Understanding Camera Motions in Any Video
  Paper • 2504.15376 • Published • 157
- Antidistillation Sampling
  Paper • 2504.13146 • Published • 60