AI2 WildBench Leaderboard (V2)
Display LLM performance leaderboards with customizable views
View the LMArena leaderboard in full-screen
Track, rank and evaluate open LLMs and chatbots
Embedding Leaderboard
Compare LLM hardware performance and find the best model
Explore and compare code model performance on a leaderboard
Explore and compare speech recognition model benchmarks
Explore and compare model scores on RewardBench benchmarks
Jailbreak the LLM and privacy guardrails
View the Berkeley Function-Calling Leaderboard