Echo-Memory: A Controlled Study of Memory in Action World Models Paper • 2606.09803 • Published 6 days ago • 32
ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents Paper • 2604.23781 • Published Apr 26 • 33
UniGeo: Unifying Geometric Guidance for Camera-Controllable Image Editing via Video Models Paper • 2604.17565 • Published Apr 19 • 10
RefineAnything: Multimodal Region-Specific Refinement for Perfect Local Details Paper • 2604.06870 • Published Apr 8 • 43
Article NEO-unify: Building Native Multimodal Unified Models End to End sensenova • Mar 5 • 164
CoCo: Code as CoT for Text-to-Image Preview and Rare Concept Generation Paper • 2603.08652 • Published Mar 9 • 41
Next-Embedding Prediction Makes Strong Vision Learners Paper • 2512.16922 • Published Dec 18, 2025 • 91
GEBench: Benchmarking Image Generation Models as GUI Environments Paper • 2602.09007 • Published Feb 9 • 39
Does Understanding Inform Generation in Unified Multimodal Models? From Analysis to Path Forward Paper • 2511.20561 • Published Nov 25, 2025 • 33
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Paper • 2511.19418 • Published Nov 24, 2025 • 29
CoVT: Chain-of-Visual-Thought Collection Enrich VLMs’ vision-centric reasoning capabilities via Chain-of-Visual-Thought! • 7 items • Updated Nov 25, 2025 • 6
SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models Paper • 2510.12784 • Published Oct 14, 2025 • 20
GIR-Bench: Versatile Benchmark for Generating Images with Reasoning Paper • 2510.11026 • Published Oct 13, 2025 • 18
Reconstruction Alignment Improves Unified Multimodal Models Paper • 2509.07295 • Published Sep 8, 2025 • 40
Article Kimina-Prover: Applying Test-time RL Search on Large Formal Reasoning Models AI-MO • Jul 10, 2025 • 56
RecA Collection Unlocking the Massive Zero-shot Potential in Unified Multimodal Models through Self-supervised Learning! • 8 items • Updated Sep 22, 2025 • 14