Question 1

What does the BBQ Ambig benchmark measure?

Accepted Answer

BBQ Ambig is the ambiguous-context subset of the Bias Benchmark for QA (BBQ). It measures how models respond to questions that probe social bias when the context is underspecified. This page currently tracks 0 model variants; for this benchmark, higher scores are better.

Question 2

Is a higher BBQ Ambig score always better?

Accepted Answer

For this benchmark, higher is better. A high score helps you shortlist, but confirm pricing, context window, and provider availability on each model page before committing. The top scorer is not always the right pick for your workload or budget.

Question 3

How current is this BBQ Ambig data?

Accepted Answer

This benchmark was last reviewed on Apr 15, 2026. Re-check the linked model pages for the freshest provider and pricing detail.
