LLM Reference
BBH
Composite

BIG-Bench Hard

About

Evaluates models on 23 challenging BIG-Bench tasks, chosen because earlier language models failed to outperform average human raters on them.