Learning from the Self-future: On-policy Self-distillation for dLLMs Paper • 2606.18195 • Published 3 days ago • 70
CroCo: Cross-Lingual Contrastive Preference Tuning on Self-Generations Paper • 2605.26293 • Published 25 days ago • 6
Gamma-World: Generative Multi-Agent World Modeling Beyond Two Players Paper • 2605.28816 • Published 23 days ago • 426
IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools Paper • 2605.20682 • Published about 1 month ago • 84
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information Paper • 2605.11609 • Published May 12 • 196
OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents Paper • 2605.05185 • Published May 6 • 105
The Geometric Canary: Predicting Steerability and Detecting Drift via Representational Stability Paper • 2604.17698 • Published Apr 20 • 4
Adam's Law: Textual Frequency Law on Large Language Models Paper • 2604.02176 • Published Apr 2 • 507
LatentUM: Unleashing the Potential of Interleaved Cross-Modal Reasoning via a Latent-Space Unified Model Paper • 2604.02097 • Published Apr 2 • 32
ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers Paper • 2603.24414 • Published Mar 25 • 183
CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence Paper • 2603.28032 • Published Mar 30 • 343
VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training Paper • 2602.10693 • Published Feb 11 • 221
Does Your Reasoning Model Implicitly Know When to Stop Thinking? Paper • 2602.08354 • Published Feb 9 • 266