Composite

WildBench

About

Evaluates models on a diverse set of challenging tasks drawn from real user conversations.