Robust-U1: Can MLLMs Self-Recover Corrupted Visual Content for Robust Understanding? Paper • 2606.08063 • Published 14 days ago • 78
Claw-SWE-Bench: A Benchmark for Evaluating OpenClaw-style Agent Harnesses on Coding Tasks Paper • 2606.12344 • Published 10 days ago • 66
CoHyDE: Iterative Co-Training of LLM Rewriter & Dense Encoder for Tool Retrieval Paper • 2605.29271 • Published 23 days ago • 9
From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors Paper • 2605.31042 • Published 22 days ago • 18
Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality? Paper • 2605.22109 • Published 30 days ago • 170