LLM Reference
reviewed this week 1715 chat · 25 image · 17 video · 24 voice · 6 music

The 18 LLM leaderboards we'd actually use.

Editor-curated picks for every job — coding, writing, image, voice. One Editor's Choice per board, every pick tagged with the use case it qualified for. Refreshed weekly.

Editorial tiers: Excellent · Strong · Solid

How picks work

A model is eligible for a board only if it's tagged with that use case. Editors pin a handful per board, with exactly one designated Editor's Choice.
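The eligibility rule above can be sketched as a small validation check. This is only an illustration under stated assumptions: the names `Model`, `Pick`, `Board`, and `validate_board` are hypothetical and do not describe how the site is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Model:
    """Hypothetical model record: a name plus the use cases it is tagged with."""
    name: str
    use_cases: frozenset


@dataclass(frozen=True)
class Pick:
    """Hypothetical editor pick; at most one per board should be the Editor's Choice."""
    model: Model
    editors_choice: bool = False


@dataclass
class Board:
    """Hypothetical board: one use case and the picks editors pinned to it."""
    use_case: str
    picks: list = field(default_factory=list)


def validate_board(board: Board) -> list:
    """Return a list of rule violations; an empty list means the board is valid."""
    problems = []
    # Rule 1: every pinned model must be tagged with the board's use case.
    for pick in board.picks:
        if board.use_case not in pick.model.use_cases:
            problems.append(f"{pick.model.name} is not tagged '{board.use_case}'")
    # Rule 2: exactly one pick is the designated Editor's Choice.
    choices = sum(1 for pick in board.picks if pick.editors_choice)
    if choices != 1:
        problems.append(f"expected exactly 1 Editor's Choice, found {choices}")
    return problems
```

Returning a list of problems rather than raising on the first one lets an editor see every violation on a board at once.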

Editorial tiers

Each pick is bucketed into one of three qualitative tiers — Excellent · Strong · Solid. No decimals, no composite score, just editorial judgment.

vs. /best composites

Picks are opinionated. /best is the objective benchmark composite for the same capability.