Infinite Dataset Hub ♾ (285): Search and save datasets generated with an LLM in real time
Reward Bench Leaderboard 📐 (432): Explore and compare model scores on RewardBench benchmarks
AI2 WildBench Leaderboard (V2) 🦁 (232): Display LLM performance leaderboards with customizable views