Claude Opus 4.6
Claude Opus 4.6 is worth evaluating for coding, RAG, and agent workloads when its provider route and context window match the workload.
Use it for
- Teams evaluating coding, RAG, and agents
- Workloads that can use a 1M-token context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Workloads where another current model has stronger sourced task evidence

- Family: Claude 4.6
- Released: 2026-02-05
- Context: 1M tokens
- Architecture: Decoder-only
- Knowledge cutoff: 2025-12
- Specialization: General
- Openness: Proprietary
- License: Proprietary (commercial use: conditional)
- Training: Fine-tuned
Cheapest of 6 routes · Anthropic · cache read $0.500
About
Claude Opus 4.6 is Anthropic's Claude 4.6 model with multimodal text and image input and an optional reasoning mode. It offers a 1M-token context window and scores 80.8 on SWE-bench Verified.
Claude Opus 4.6 is a proprietary model in the Claude 4.6 family. The structured metadata tracks a 1M-token context window, multimodal input, reasoning, function calling, tool use, structured outputs, and code execution. This page tracks provider routes through Anthropic, AWS Bedrock, GCP Vertex AI, and 3 more, with the cheapest tracked route listed at $5 input and $25 output per 1M tokens. Headline tracked benchmarks include SWE-bench Verified (80.8), Google-Proof Q&A (91.3), and SWE-bench Pro (53.4).
Top use-case fit: coding, agents, and build tasks
Coding
Q/$ D · 5 relevant benchmarks in the decision map.
RAG
Included by capability and metadata signals in the decision map.
Agents
Q/$ D · 2 relevant benchmarks in the decision map.
Provider price ladder
Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Batch in / out | Cache (read / 5m write / 1h write) | Route |
|---|---|---|---|---|---|
| Anthropic | $5.00 | $25.00 | $2.50 / $12.50 | read $0.500 / 5m $6.25 / 1h $10.00 | Serverless |
| Microsoft Foundry | $5.00 | $25.00 | - | read $0.500 / 5m $6.25 / 1h $10.00 | Serverless |
| OpenRouter | $5.00 | $25.00 | - | read $0.500 / 5m $6.25 | Serverless |
| Vercel AI Gateway | $5.00 | $25.00 | - | read $0.500 | Serverless |
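
The per-1M-token prices in the table can be turned into a per-request estimate. The following is a minimal sketch, not a billing tool: the `RoutePrice` class and `estimate_cost` function are hypothetical names, the default prices are copied from the Anthropic row above, and the sketch assumes that batch and cache-read discounts are not combined and that cache-write charges are ignored.

```python
from dataclasses import dataclass

TOKENS_PER_UNIT = 1_000_000  # table prices are quoted per 1M tokens


@dataclass(frozen=True)
class RoutePrice:
    """USD prices per 1M tokens (defaults: Anthropic row of the table)."""
    input: float = 5.00
    output: float = 25.00
    cache_read: float = 0.50
    batch_input: float = 2.50
    batch_output: float = 12.50


def estimate_cost(price, input_tokens, output_tokens, cached_tokens=0, batch=False):
    """Estimate the USD cost of one request.

    cached_tokens is the portion of input_tokens billed at the cache-read
    rate. Assumption: batch pricing and cache reads are not combined.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    if batch:
        if cached_tokens:
            raise ValueError("this sketch does not combine batch and cache pricing")
        total = input_tokens * price.batch_input + output_tokens * price.batch_output
    else:
        fresh_tokens = input_tokens - cached_tokens
        total = (
            fresh_tokens * price.input
            + cached_tokens * price.cache_read
            + output_tokens * price.output
        )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    p = RoutePrice()
    print(f"{estimate_cost(p, 200_000, 4_000):.3f}")                         # 1.100
    print(f"{estimate_cost(p, 200_000, 4_000, cached_tokens=150_000):.3f}")  # 0.425
    print(f"{estimate_cost(p, 200_000, 4_000, batch=True):.3f}")             # 0.550
```

For a 200,000-token prompt with a 4,000-token reply, the listed prices work out to about $1.10 at standard rates, about $0.43 when 150,000 of the input tokens are cache reads, and $0.55 at batch rates.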
Available via routers & gateways (16)
AIRouter
Router · Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.
Amazon Bedrock Intelligent Prompt Routing
Router · AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.
Azure AI Foundry Model Router
Router · Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.
Helicone
Gateway · Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.
Kong AI Gateway
Gateway · Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.
LiteLLM
Gateway · Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.
Capabilities
Benchmark peer bars for Coding
Benchmark scores (20)
| Benchmark | Score | Version / notes | Source |
|---|---|---|---|
| SWE-bench Verified | 80.8 | 2026-02 | https://www.anthropic.com/news/claude-4 |
| Google-Proof Q&A | 91.3 | diamond | https://www.anthropic.com/claude/opus |
| SWE-bench Pro | 53.4 | — | https://anthropic.com/glasswing |
| MMLU PRO | 89.1 | — | https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro |
| τ-bench | 84.8 | τ-bench | https://benchlm.ai/benchmarks/tauBench |
| Chatbot Arena | 1501.0 | Thinking | https://arena.ai/leaderboard |
| MMMU Pro | 77.3 | official Anthropic system card, adaptive thinking, max effort, with image cropping tool, avg 5 runs | https://www.anthropic.com/claude-opus-4-6-system-card |
| SWE-rebench | 65.3 | pass@1 (best of 5 runs) | https://swe-rebench.com/leaderboard |
| Aider Polyglot | 72.0 | Listed as 'Claude Opus 4 (32k thinking)' = 72 (percent_correct) | https://aider.chat/docs/leaderboards/ |
| AIME 2025 | 94.2 | AIME 2025 (accuracy) | https://automatio.ai/models/claude-opus-4-6 |
| ARC-AGI-2 | 68.8 | ARC-AGI-2 (accuracy%) | https://llm-stats.com/benchmarks/arc-agi-v2 |
| BrowseComp | 84.0 | BrowseComp (accuracy) | https://www.vellum.ai/blog/claude-opus-4-6-benchmarks |
| Humanity's Last Exam | 53.0 | HLE with tools enabled (accuracy) | https://www.anthropic.com/news/claude-opus-4-6 |
| HumanEval | 95.0 | HumanEval (pass@1) | https://automatio.ai/models/claude-opus-4-6 |
| LiveCodeBench | 70.2 | LiveCodeBench score (accuracy) | https://automatio.ai/models/claude-opus-4-6 |
| MathVista | 74.8 | MathVista accuracy (accuracy) | https://automatio.ai/models/claude-opus-4-6 |
| MCP-Atlas | 62.7 | MCP-Atlas (accuracy%) | https://llm-stats.com/benchmarks/mcp-atlas |
| Massive Multitask Language Understanding | 91.1 | MMLU (accuracy) | https://automatio.ai/models/claude-opus-4-6 |
| Massive Multi-discipline Multimodal Understanding | 76.5 | MMMU (accuracy) | https://automatio.ai/models/claude-opus-4-6 |
| Terminal-Bench 2.0 | 65.4 | Terminal-Bench 2.0 (accuracy%) | https://llm-stats.com/benchmarks/terminal-bench-2 |
Migration checks
Rankings & picks (10)
Comparison and alternatives