DeepSeek V4 Pro
DeepSeek V4 Pro is worth evaluating for coding, RAG, and agents when its provider route and context window match the workload.
Use it for
- Teams evaluating coding, RAG, and agents
- Workloads that can use a 1M-token context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
| Spec | Value |
|---|---|
| Family | DeepSeek V4 |
| Released | 2026-04-24 |
| Context | 1M tokens |
| Max output | 384,000 tokens |
| Parameters | 1.6T |
| Architecture | Mixture of Experts (MoE) with CSA+HCA hybrid attention |
| Specialization | General |
| Openness | Open source |
| License | MIT (OSI-approved), commercial use allowed |
| Training | Pretrained |
Cheapest of 5 routes · DeepSeek Platform · cache read $0.0036
About
DeepSeek V4 Pro is DeepSeek's flagship open-weights model, released April 24, 2026 under the MIT license. It is a Mixture-of-Experts (MoE) model with 1.6T total and 49B active parameters. Its hybrid of Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) is reported to need only 27% of the inference FLOPs of a standard 1M-context transformer; the model also uses Manifold-Constrained Hyper-Connections (mHC) and the Muon optimizer. The context window is 1,000,000 tokens and the maximum output is 384,000 tokens (Think Max mode requires a context of at least 384K tokens). The model is text-only, with no vision or image input. It supports three reasoning modes (Non-Think, Think High, Think Max) as well as function calling, tool use, and structured outputs. Key benchmarks: SWE-bench Verified 80.6%, SWE-bench Pro 55.4%, LiveCodeBench 93.5%, GPQA Diamond 90.1%, MMLU-Pro 87.5%, Terminal-Bench 2.0 67.9%, Chatbot Arena 1460 (2026-04-28). Current API pricing is $0.435 input and $0.87 output per 1M tokens; DeepSeek made the former 75% promotional rate permanent effective 2026-05-31 15:59 UTC.
DeepSeek V4 Pro is an open-source model in the DeepSeek V4 family. The structured metadata tracks a 1M-token context window, reasoning, function calling, tool use, and structured outputs. This page tracks provider routes through DeepSeek Platform, Fireworks AI, OpenRouter, and 2 more, with the cheapest tracked route listed at $0.435 input and $0.87 output per 1M tokens. Headline tracked benchmarks include GPQA Diamond (Google-Proof Q&A) 90.1, MMLU (Massive Multitask Language Understanding) 90.1, and MMLU-Pro 87.5.
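The context and output limits above suggest a simple budget check before sending a long prompt. The sketch below assumes, as is common but not stated on this page, that prompt and output tokens share the single 1,000,000-token window. `max_prompt_tokens` and `fits` are hypothetical helpers for illustration, not part of any DeepSeek SDK.

```python
CONTEXT_WINDOW = 1_000_000  # tokens, from the listing
MAX_OUTPUT = 384_000        # tokens, from the listing


def max_prompt_tokens(requested_output: int) -> int:
    """Largest prompt that still leaves room for `requested_output` tokens.

    Assumption: prompt and output share one context window.
    """
    if not 0 < requested_output <= MAX_OUTPUT:
        raise ValueError(f"requested_output must be in 1..{MAX_OUTPUT}")
    return CONTEXT_WINDOW - requested_output


def fits(prompt_tokens: int, requested_output: int) -> bool:
    """True if the request stays inside the shared window."""
    return prompt_tokens <= max_prompt_tokens(requested_output)
```

Under that assumption, reserving the full 384,000-token output leaves 616,000 tokens for the prompt.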
Top use-case fit: coding, agents, and build tasks
Coding
Q/$ grade: C. 4 relevant benchmarks in the decision map.
RAG
Included by capability and metadata signals in the decision map.
Agents
Q/$ grade: B. 1 relevant benchmark in the decision map.
Provider price ladder
Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Cache | Route |
|---|---|---|---|---|
| DeepSeek Platform | $0.435 | $0.870 | read $0.0036 | Serverless |
| Vercel AI Gateway | $0.435 | $0.870 | read $0.0036 | Serverless |
| OpenRouter | $0.440 | $0.870 | - | Serverless |
| Novita AI | $1.64 | $3.38 | - | Serverless |
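To turn the price ladder into a per-request estimate, the rates can be applied to token counts directly. This is a minimal sketch: the rates are copied from the table, while the function names, the assumption that the cache-read figure is also priced per 1M tokens, and the fallback of billing cached tokens at the normal input rate when no cache price is listed are all illustrative choices rather than documented provider behavior.

```python
from typing import Optional

# (input $/1M, output $/1M, cache-read $/1M or None), copied from the table.
# Assumption: the cache-read figure is also priced per 1M tokens.
ROUTES: dict[str, tuple[float, float, Optional[float]]] = {
    "DeepSeek Platform": (0.435, 0.870, 0.0036),
    "Vercel AI Gateway": (0.435, 0.870, 0.0036),
    "OpenRouter": (0.440, 0.870, None),
    "Novita AI": (1.64, 3.38, None),
}


def request_cost(route: str, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request on `route`."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    in_rate, out_rate, cache_rate = ROUTES[route]
    if cache_rate is None:
        # Assumption: with no listed cache price, cached tokens bill as input.
        cache_rate = in_rate
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * in_rate
             + cached_tokens * cache_rate
             + output_tokens * out_rate)
    return total / 1_000_000


def cheapest_route(input_tokens: int, output_tokens: int,
                   cached_tokens: int = 0) -> str:
    """Name of the lowest-cost route for this token mix (first one wins ties)."""
    return min(ROUTES, key=lambda r: request_cost(
        r, input_tokens, output_tokens, cached_tokens))
```

For agent or RAG workloads that resend a large shared prefix, passing a realistic `cached_tokens` value matters more than the small input-price gap between the three cheapest routes.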
Available via routers & gateways (4)
LiteLLM
Gateway. Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.
OpenRouter
Hybrid. Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.
Requesty
Hybrid. AI gateway to 400+ LLM providers with intelligent routing, caching, guardrails, and governance; flat 5% markup on model costs with no subscription fee.
Respan
Hybrid. Unified LLM engineering platform (gateway + observability + evals + prompt management) routing across 250+ models; previously Keywords AI, rebranded February 2026.
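A flat percentage markup such as Requesty's 5% is easy to fold into a price comparison. The sketch below assumes the markup is applied to the underlying provider's list price and uses the cheapest tracked rates from this page as the base; which provider a gateway actually routes this model to is not stated here, and `with_markup` is a hypothetical helper.

```python
def with_markup(base_price_per_million: float, markup: float = 0.05) -> float:
    """Effective USD price per 1M tokens after a flat percentage markup."""
    if markup < 0:
        raise ValueError("markup must be non-negative")
    return base_price_per_million * (1 + markup)


# Cheapest tracked direct rates from this page, in USD per 1M tokens.
# Assumption: the gateway passes these rates through before adding its markup.
effective_input = with_markup(0.435)
effective_output = with_markup(0.87)
```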
Capabilities
Benchmark peer bars for Coding
Benchmark scores (14)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A | 90.1 | diamond | https://www.datacamp.com/blog/deepseek-v4 |
| Massive Multitask Language Understanding | 90.1 | 5-shot | https://api-docs.deepseek.com/news/news260424 |
| MMLU-Pro | 87.5 | Think Max mode (accuracy) | https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash |
| SWE-bench Verified | 80.6 | SWE-bench Verified | https://www.swebench.com/verified.html |
| Chatbot Arena | 1460.0 | — | https://arena.ai/leaderboard |
| LiveCodeBench | 93.5 | Think Max mode (pass@1) | https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash |
| SWE-bench Pro | 55.4 | — | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| HumanEval | 76.8 | Pass@1 | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| Mathematics Aptitude Test of Heuristics | 64.5 | — | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| Terminal-Bench 2.0 | 67.9 | Terminal-Bench 2.0 (accuracy%) | https://benchlm.ai/benchmarks/terminalBench2 |
| SWE-bench Multilingual | 76.2 | — | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| BrowseComp | 83.4 | BrowseComp (accuracy%) | https://benchlm.ai/benchmarks/browseComp |
| Humanity's Last Exam | 37.7 | HLE (accuracy) | https://www.morphllm.com/deepseek-v4 |
| MCP-Atlas | 73.6 | MCP-Atlas (accuracy%) | https://benchlm.ai/benchmarks/mcpAtlas |
Migration checks
No linked migration route is available for this model yet.
Rankings & picks (10)
Comparison and alternatives