InstanceControl: Controllable Complex Image Generation without Instance Labeling Paper • 2606.31924 • Published 5 days ago • 8
DataClaw0: Agentic Tailoring Multimodal Data from Raw Streams Paper • 2606.21337 • Published 16 days ago • 74
VideoMLA: Low-Rank Latent KV Cache for Minute-Scale Autoregressive Video Diffusion Paper • 2605.30351 • Published May 28 • 26
GameWorld: Towards Standardized and Verifiable Evaluation of Multimodal Game Agents Paper • 2604.07429 • Published Apr 8 • 123
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time Paper • 2604.11626 • Published Apr 13 • 103
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183
ShotStream: Streaming Multi-Shot Video Generation for Interactive Storytelling Paper • 2603.25746 • Published Mar 26 • 155
SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models Paper • 2603.16859 • Published Mar 17 • 248