LLM Reference
BBH
Composite
BIG-Bench Hard
About
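Evaluates models on BIG-Bench Hard (BBH), a suite of 23 challenging tasks drawn from BIG-Bench on which earlier language models did not outperform the average human rater.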
Resources
GitHub
arXiv Paper
HuggingFace
Papers With Code