LLM Reference
GSM8K | active | Mathematics

Grade School Math 8K

Metric: Accuracy (higher is better)
Introduced: 2021

About

About 8,500 grade school math word problems, each requiring 2–8 steps of basic arithmetic reasoning to solve. Widely used to evaluate chain-of-thought reasoning; top models now score near 100%, leaving little headroom.
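
The accuracy metric can be illustrated with a minimal sketch. It relies on the GSM8K convention that each reference solution ends with a line of the form `#### <answer>`. Everything else here is an assumption made for illustration rather than an official scoring script: the helper names (`reference_answer`, `predicted_answer`, `is_correct`, `accuracy`) are hypothetical, the "last number in the output" extraction rule is a common heuristic, and the two sample problems are made up.

```python
import re
from typing import Optional, Sequence

# Integers or decimals, with an optional sign and thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def reference_answer(solution: str) -> str:
    """Return the text after the final '####' marker in a reference solution."""
    return solution.split("####")[-1].strip().replace(",", "")


def predicted_answer(output: str) -> Optional[str]:
    """Heuristic (assumption): treat the last number in the output as the answer."""
    matches = _NUMBER.findall(output)
    if not matches:
        return None
    return matches[-1].replace(",", "")


def is_correct(output: str, solution: str) -> bool:
    """Compare predicted and reference answers numerically."""
    pred = predicted_answer(output)
    if pred is None:
        return False
    try:
        return float(pred) == float(reference_answer(solution))
    except ValueError:
        # Reference or prediction was not a parseable number.
        return False


def accuracy(outputs: Sequence[str], solutions: Sequence[str]) -> float:
    """Fraction of problems whose predicted answer matches the reference."""
    if not solutions:
        return 0.0
    correct = sum(is_correct(o, s) for o, s in zip(outputs, solutions))
    return correct / len(solutions)


# Made-up examples in the GSM8K answer format (not taken from the dataset).
solutions = [
    "3 + 4 = 7 apples.\n#### 7",
    "20 * 50 = 1000 dollars.\n#### 1,000",
]
outputs = [
    "She has 3 + 4 = 7 apples. The answer is 7.",
    "The total is 900.",
]
print(accuracy(outputs, solutions))  # 0.5
```

Comparing answers as floats rather than strings means `18` and `18.0` count as the same answer. Stricter or more lenient extraction rules will shift reported accuracy slightly, which is worth keeping in mind when comparing numbers across sources.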