LLM Reference

DeepSeek V4 Pro

Released: 2026-04-24
Last refreshed: 2026-05-31
Status: Researched 8d ago
Open Source · Commercial use allowed · Coding · RAG · Agents · Long context · Classification · JSON / Tool use

DeepSeek V4 Pro is worth evaluating for coding, RAG, and agent workloads when its provider route and context window match the workload.

Use it for

  • Teams evaluating coding, RAG, and agent workloads
  • Workloads that can use a 1M-token context window
  • Buyers comparing 5 tracked provider routes

Do not use it for

  • Vision or document-understanding workloads

Specifications

Released: 2026-04-24
Context: 1M tokens
Max output: 384,000 tokens
Parameters: 1.6T
Architecture: Mixture of Experts (MoE) with CSA+HCA hybrid attention
Specialization: general
Openness: Open source
License: MIT (OSI) · Commercial use allowed
Training: pretrained
Created by: DeepSeek

Advancing artificial general intelligence (AGI).

Hangzhou, Zhejiang, China
Founded 2023

Pricing

Input / 1M tokens: $0.435
Output / 1M tokens: $0.870

Cheapest of 5 routes · DeepSeek Platform · cache read $0.0036
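
To make the listed rates concrete, here is a minimal cost sketch in Python. The input and output prices are the ones quoted above for the DeepSeek Platform route. The function name `estimate_cost` is hypothetical, and the sketch assumes the cache-read price is also quoted per 1M tokens and applies to the cached share of the input, which this page does not state explicitly.

```python
from decimal import Decimal

# Prices copied from the pricing block above (USD per 1M tokens).
INPUT_PER_M = Decimal("0.435")
OUTPUT_PER_M = Decimal("0.870")
# Assumption: the cache-read price is quoted per 1M tokens as well.
CACHE_READ_PER_M = Decimal("0.0036")
MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> Decimal:
    """Estimate the USD cost of one request.

    cached_input_tokens is the part of input_tokens billed at the
    cache-read rate; the rest is billed at the normal input rate.
    """
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")
    fresh_input = input_tokens - cached_input_tokens
    total = (
        fresh_input * INPUT_PER_M
        + cached_input_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / MILLION
```

Under these assumptions, a request with 100,000 input tokens (80,000 of them cache reads) and 2,000 output tokens comes to about $0.0107. Other provider routes will price differently, so treat the constants as placeholders for whichever route you use.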

About

DeepSeek V4 Pro is DeepSeek's flagship open-weights model, released on April 24, 2026 under the MIT license.

Architecture: a Mixture of Experts (MoE) model with 1.6T total and 49B active parameters. It combines Compressed Sparse Attention (CSA) with Heavily Compressed Attention (HCA) in a hybrid design that is reported to need only 27% of the inference FLOPs of a standard transformer at 1M-token context. It also uses Manifold-Constrained Hyper-Connections (mHC) and the Muon optimizer.

Context and modes: the context window is 1,000,000 tokens and the maximum output is 384,000 tokens. The model is text-only, with no vision or image input. It supports three reasoning modes: Non-Think, Think High, and Think Max (Think Max requires a context of at least 384K tokens). Function calling, tool use, and structured outputs are supported.

Key benchmarks: SWE-bench Verified 80.6%, SWE-bench Pro 55.4%, LiveCodeBench 93.5%, GPQA Diamond 90.1%, MMLU-Pro 87.5%, Terminal-Bench 2.0 67.9%, Chatbot Arena 1460 (2026-04-28).

Current API pricing: $0.435 input and $0.87 output per 1M tokens. DeepSeek made the former 75% promotional rate permanent effective 2026-05-31 15:59 UTC.
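
The limits described above can be turned into a small pre-flight check. This is a sketch under stated assumptions, not DeepSeek's own validation logic. The function name `check_request` and the `route_context` parameter are hypothetical, and the sketch assumes that prompt and output tokens share a single context window and that a provider route may expose less than the full 1M tokens.

```python
# Limits copied from the description above.
CONTEXT_WINDOW = 1_000_000        # tokens
MAX_OUTPUT_TOKENS = 384_000       # tokens
THINK_MAX_MIN_CONTEXT = 384_000   # Think Max requires at least this much context
MODES = ("Non-Think", "Think High", "Think Max")


def check_request(prompt_tokens: int, max_output_tokens: int, mode: str,
                  route_context: int = CONTEXT_WINDOW) -> list[str]:
    """Return a list of problems; an empty list means the request fits.

    Assumption: prompt and output tokens count against the same window,
    and route_context is whatever the chosen provider route exposes.
    """
    problems = []
    if mode not in MODES:
        problems.append(f"unknown reasoning mode: {mode!r}")
    if max_output_tokens > MAX_OUTPUT_TOKENS:
        problems.append(
            f"max output {max_output_tokens} exceeds {MAX_OUTPUT_TOKENS}"
        )
    if prompt_tokens + max_output_tokens > route_context:
        problems.append(
            f"prompt + output ({prompt_tokens + max_output_tokens}) exceeds "
            f"the route's context window ({route_context})"
        )
    if mode == "Think Max" and route_context < THINK_MAX_MIN_CONTEXT:
        problems.append(
            f"Think Max needs a context window of at least {THINK_MAX_MIN_CONTEXT}"
        )
    return problems
```

For example, a 600,000-token prompt with the full 384,000-token output budget fits in the 1M window, while a 700,000-token prompt with the same output budget does not.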

DeepSeek V4 Pro is an open-source model in the DeepSeek V4 family. The structured metadata tracks a 1M-token context window, reasoning, function calling, tool use, and structured outputs. This page tracks provider routes through DeepSeek Platform, Fireworks AI, OpenRouter, and 2 more, with the cheapest tracked route listed at $0.435 input and $0.87 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 90.1, Massive Multitask Language Understanding 90.1, and MMLU-Pro 87.5.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$ C

4 relevant benchmarks in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Agents

Q/$ B

1 relevant benchmark in the decision map.

Capabilities

Reasoning · Function Calling · Tool Use · Structured Outputs · Prompt Caching

Benchmark scores (14)

Scores are benchmark-specific: each suite has its own scale and direction, so the same numeric gap can mean very different outcomes from one suite to the next. Use the leaderboard context and this model's provider route to decide whether a winning margin is meaningful for your workload.

| Benchmark | Score | Version | Source |
| --- | --- | --- | --- |
| Google-Proof Q&A | 90.1 | diamond | https://www.datacamp.com/blog/deepseek-v4 |
| Massive Multitask Language Understanding | 90.1 | 5-shot | https://api-docs.deepseek.com/news/news260424 |
| MMLU PRO | 87.5 | Think Max mode (accuracy) | https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash |
| SWE-bench Verified | 80.6 | SWE-bench Verified | https://www.swebench.com/verified.html |
| Chatbot Arena | 1460.0 | | https://arena.ai/leaderboard |
| LiveCodeBench | 93.5 | Think Max mode (pass@1) | https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash |
| SWE-bench Pro | 55.4 | | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| HumanEval | 76.8 | Pass@1 | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| Mathematics Aptitude Test of Heuristics | 64.5 | | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| Terminal-Bench 2.0 | 67.9 | Terminal-Bench 2.0 (accuracy%) | https://benchlm.ai/benchmarks/terminalBench2 |
| SWE-bench Multilingual | 76.2 | | https://huggingface.co/deepseek-ai/DeepSeek-V4-Pro |
| BrowseComp | 83.4 | BrowseComp (accuracy%) | https://benchlm.ai/benchmarks/browseComp |
| Humanity's Last Exam | 37.7 | HLE (accuracy) | https://www.morphllm.com/deepseek-v4 |
| MCP-Atlas | 73.6 | MCP-Atlas (accuracy%) | https://benchlm.ai/benchmarks/mcpAtlas |
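
Because the scores above sit on different scales, it can help to tag each one before comparing margins. The sketch below copies a handful of scores from the table. The scale labels (`"percent"`, `"rating"`) and the function names are assumptions added for illustration, not part of the tracked data, and the peer score is something you would supply from a leaderboard yourself.

```python
from collections import defaultdict

# (benchmark, score, scale). Scores are copied from the table above; the
# scale labels are assumptions added for this sketch.
SCORES = [
    ("SWE-bench Verified", 80.6, "percent"),
    ("LiveCodeBench", 93.5, "percent"),
    ("Terminal-Bench 2.0", 67.9, "percent"),
    ("BrowseComp", 83.4, "percent"),
    ("MCP-Atlas", 73.6, "percent"),
    ("Chatbot Arena", 1460.0, "rating"),
]


def group_by_scale(scores):
    """Group (benchmark, score) pairs by scale so unlike scales are never mixed."""
    groups = defaultdict(list)
    for name, score, scale in scores:
        groups[scale].append((name, score))
    return dict(groups)


def gap(scores, benchmark, peer_score):
    """Return (signed gap to a peer, scale) for a single benchmark.

    Returning the scale alongside the gap keeps a 10-point rating gap from
    being read as a 10-percentage-point accuracy gap.
    """
    for name, score, scale in scores:
        if name == benchmark:
            return round(score - peer_score, 1), scale
    raise KeyError(benchmark)
```

Keeping the scale attached to every gap is a deliberate choice: it forces the reader to interpret each margin in the context of its own suite, which is the caution the note above the table gives.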

Migration checks

No linked migration route is available for this model yet.
