CHI-Bench: Can AI Agents Automate End-to-End, Long-Horizon, Policy-Rich Healthcare Workflows? Paper • 2605.16679 • Published 10 days ago • 51
Webscale-RL: Automated Data Pipeline for Scaling RL Data to Pretraining Levels Paper • 2510.06499 • Published Oct 7, 2025 • 33
Reinforcing Multi-Turn Reasoning in LLM Agents via Turn-Level Credit Assignment Paper • 2505.11821 • Published May 17, 2025 • 14