Seedance 2.0: Advancing Video Generation for World Complexity Paper • 2604.14148 • Published Apr 15 • 162
Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe Paper • 2604.13016 • Published Apr 14 • 107
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling Paper • 2604.06916 • Published Apr 8 • 34
DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing Paper • 2603.28713 • Published Mar 30 • 22
VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation Paper • 2601.02256 • Published Jan 5 • 33
NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation Paper • 2601.02204 • Published Jan 5 • 63
MathCanvas: Intrinsic Visual Chain-of-Thought for Multimodal Mathematical Reasoning Paper • 2510.14958 • Published Oct 16, 2025 • 23
SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models Paper • 2510.12784 • Published Oct 14, 2025 • 20
view article Article Gemma 3n fully available in the open-source ecosystem! +6 ariG23498, pcuenq, sergiopaniego, reach-vb, FL33TW00D-HF, Xenova, Steveeeeeeen, kashif • Jun 26, 2025 • 121
XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation Paper • 2506.21416 • Published Jun 26, 2025 • 28