Cut Your Losses! Learning to Prune Paths Early for Efficient Parallel Reasoning • Paper • 2604.16029 • Published Apr 17 • 23
OneVL: One-Step Latent Reasoning and Planning with Vision-Language Explanation • Paper • 2604.18486 • Published Apr 20 • 94
ClawEnvKit: Automatic Environment Generation for Claw-Like Agents • Paper • 2604.18543 • Published Apr 20 • 30
Stabilizing Efficient Reasoning with Step-Level Advantage Selection • Paper • 2604.24003 • Published 26 days ago • 8
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information • Paper • 2605.11609 • Published 11 days ago • 189