DeepSeek V4 Pro
Open Source
About
DeepSeek V4 Pro is the flagship Mixture-of-Experts language model, with 1.6T total parameters (49B activated) and a 1M-token context window. Its hybrid attention (CSA+HCA) requires only 27% of the inference FLOPs of DeepSeek-V3.2 at 1M context. It also features Manifold-Constrained Hyper-Connections (mHC) and the Muon optimizer for training stability. It scores 93.5% on LiveCodeBench, 89.8% on IMOAnswerBench, and 90.1% on MMLU, and supports three reasoning modes: Non-Think, Think High, and Think Max. Pricing is $1.74 per 1M input tokens ($0.145 per 1M on a cache hit) and $3.48 per 1M output tokens. MIT licensed.
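To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using the three prices listed above. The function name `estimate_cost` and its parameters are hypothetical, and the billing rule is an assumption: cache-hit input tokens are charged at the cache-hit rate instead of the regular input rate. Actual billing may differ.

```python
# Prices in USD per 1M tokens, copied from the description above.
INPUT_PRICE_PER_M = 1.74          # input tokens (cache miss)
CACHED_INPUT_PRICE_PER_M = 0.145  # input tokens (cache hit)
OUTPUT_PRICE_PER_M = 3.48         # output tokens


def estimate_cost(input_tokens: int, output_tokens: int,
                  cached_input_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    Assumption: cached_input_tokens is the subset of input_tokens served
    from cache, billed at the cache-hit rate instead of the input rate.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_input_tokens <= input_tokens:
        raise ValueError("cached_input_tokens must be between 0 and input_tokens")

    uncached_tokens = input_tokens - cached_input_tokens
    total_per_million = (
        uncached_tokens * INPUT_PRICE_PER_M
        + cached_input_tokens * CACHED_INPUT_PRICE_PER_M
        + output_tokens * OUTPUT_PRICE_PER_M
    )
    return total_per_million / 1_000_000


if __name__ == "__main__":
    # A 200k-token prompt, half of it served from cache, with a 4k-token reply.
    cost = estimate_cost(200_000, 4_000, cached_input_tokens=100_000)
    print(f"Estimated cost: ${cost:.4f}")
```

Under these assumptions, a fully cached prompt costs roughly one twelfth of an uncached one on the input side, so reusing long shared prefixes dominates the savings for long-context workloads.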
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution
Specifications
Family: DeepSeek V4
Released: 2026-04-24
Parameters: 1.6T
Context: 1M
Architecture: Mixture of Experts
Specialization: general
License: MIT
Training: pretrained
Created by
Advancing artificial general intelligence (AGI).
Hangzhou, Zhejiang, China
Founded 2023