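| Benchmark | o3 | Gemini 2.5 Pro |
| --- | --- | --- |
| HumanEval | 96.7 | 93.1 |
| SWE-bench Verified | 71.7 | 63.2 |
| LiveCodeBench | 79.1 | 70.4 |
| Aider Polyglot | 81.3 | — |
| Chatbot Arena | 1412.0 | 1398.0 |
| Google-Proof Q&A | 87.7 | 86.4 |
| Massive Multi-discipline Multimodal Understanding | 82.9 | — |
| MMLU-Pro | — | 86.2 |
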
o3 vs Gemini 2.5 Pro
Side-by-side comparison of specifications, capabilities, and pricing.

| Specification | o3 | Gemini 2.5 Pro |
| --- | --- | --- |
| Released | 2025-03-31 | 2025-06-17 |
| Context window | 128K | 1M |
| Parameters | — | — |
| Architecture | decoder only | decoder only |
| License | Unknown | Proprietary |
| Knowledge cutoff | — | 2025-01 |
| Capabilities | | |
| Vision | | |
| Multimodal | | |
| Reasoning | | |
| Function calling | | |
| Tool use | | |
| Structured Outputs | | |
| Code execution | | |
| Availability | | |
| Providers | | |
Benchmarks