Article Is using a validation set useful for end-to-end learning in robotics? m1b • Dec 1, 2024 • 16
Qwen-VLA: Unifying Vision-Language-Action Modeling across Tasks, Environments, and Robot Embodiments Paper • 2605.30280 • Published 28 days ago • 146
UI-Venus Technical Report: Building High-performance UI Agents with RFT Paper • 2508.10833 • Published Aug 14, 2025 • 46
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 310
UI Agent Collection a collection of algorithmic agents for user interfaces/interactions, program synthesis, and robotics • 497 items • Updated 2 days ago • 69
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning Paper • 2409.20566 • Published Sep 30, 2024 • 54
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Paper • 2408.06327 • Published Aug 12, 2024 • 17
ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools Paper • 2406.12793 • Published Jun 18, 2024 • 35