Where Did It Go Wrong? Process-Level Evaluation of Web Agents with Semantic State Tracking Paper • 2606.15673 • Published Apr 8 • 11
SpatialClaw: Rethinking Action Interface for Agentic Spatial Reasoning Paper • 2606.13673 • Published 8 days ago • 99
Rethinking State Tracking in Recurrent Models Through Error Control Dynamics Paper • 2605.07755 • Published May 8 • 24
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues Paper • 2506.00958 • Published Jun 1, 2025 • 20
Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation Paper • 2505.18842 • Published May 24, 2025 • 36
VisEscape: A Benchmark for Evaluating Exploration-driven Decision-making in Virtual Escape Rooms Paper • 2503.14427 • Published Mar 18, 2025 • 19
Teaching Metric Distance to Autoregressive Multimodal Foundational Models Paper • 2503.02379 • Published Mar 4, 2025 • 4