LLM Reference

The long context leaderboard · for developers

Best for long context

4 editor picks · 6 eligible models · Past 200K tokens without falling apart.

EDITOR'S CHOICE · Researched 141d ago

Gemini 3 Pro

Google DeepMind · 1M context
Excellent

The models still credible when the window is genuinely full.

The long-context standard: 1M tokens, with input cheap enough to keep the bill survivable.

The numbers
$/1M out: $5.00 ($1.25 input)
Context: 1M (max window)
Pros
  • 1M context, strong recall
  • Cheap input tokens
  • Mature tooling
Cons
  • Not the strongest pure reasoner
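
To make the price figures on this card concrete, here is a minimal sketch of the per-request arithmetic. It assumes the listed rates ($1.25 input, $5.00 output per 1M tokens) apply flat across the whole window, with no tiered long-context pricing or caching discounts; the page does not confirm that. The function name `request_cost` and the 2,000-token answer length are illustrative assumptions.

```python
# Prices copied from the card above, in USD per 1M tokens.
# Assumption: flat pricing across the whole context window.
INPUT_PRICE_PER_M = 1.25
OUTPUT_PRICE_PER_M = 5.00


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed per-token prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# A completely full 1M-token prompt with a hypothetical 2,000-token answer.
full_window = request_cost(1_000_000, 2_000)
print(f"${full_window:.2f}")  # prints $1.26
```

Under these assumptions the input side dominates a full-window request, which is why the input price matters more than the output price for this use case.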

Also worth picking

The runners-up

ranked by editorial pick order
Editorial tiers: Excellent · Strong · Solid
# | Model | Tier | $/1M out | Editor's note
#2 | Anthropic · 1M | | $25.00 | Catches subtle cross-references across a full 1M window better than anyone; pricey but precise.
#3 | OpenAI · 1M | | $15.00 | Same 1.05M window as GPT-5.5 at half the output price ($15 vs $30); the value choice for long analytical passes.
#4 | DeepSeek · 1M | | $0.22 | 1M context at $0.22 out, open weights; the value pick for huge inputs.

Eligibility

6 models are eligible for this board

A model is eligible for this board when it is tagged with useCases: [long-context]. Pins must come from this pool.
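
The eligibility rule above can be sketched as a tag filter followed by a pin check. This is an illustration only, not the site's implementation: the `Model` class, the `use_cases` field, the function names, and the placeholder model names are all assumptions, and "pins" is read here as the editor's picked models.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Hypothetical record shape; only the useCases tag comes from the page.
    name: str
    use_cases: list[str] = field(default_factory=list)


def eligible_pool(models: list[Model], tag: str = "long-context") -> list[Model]:
    """Keep only the models tagged with the board's use case."""
    return [m for m in models if tag in m.use_cases]


def invalid_pins(pins: list[str], models: list[Model], tag: str = "long-context") -> list[str]:
    """Return the pinned names that are not in the eligible pool."""
    allowed = {m.name for m in eligible_pool(models, tag)}
    return [p for p in pins if p not in allowed]


# Placeholder catalogue; the names are made up for the example.
catalogue = [
    Model("model-a", ["long-context", "coding"]),
    Model("model-b", ["coding"]),
    Model("model-c", ["long-context"]),
]

print([m.name for m in eligible_pool(catalogue)])   # ['model-a', 'model-c']
print(invalid_pins(["model-a", "model-b"], catalogue))  # ['model-b']
```

Returning the offending pins, rather than raising on the first one, lets an editor see every invalid pick in a single pass.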

All picks