LLM Reference

DeepSeek V4 Pro

Open Source

About

DeepSeek V4 Pro is DeepSeek's flagship Mixture-of-Experts language model, with 1.6T total parameters (49B activated per token) and a 1M-token context window. It features hybrid attention (CSA + HCA), which requires only 27% of the inference FLOPs of DeepSeek-V3.2 at 1M-token context, Manifold-Constrained Hyper-Connections (mHC), and the Muon optimizer for training stability. It achieves 93.5% on LiveCodeBench, 89.8% on IMOAnswerBench, and 90.1% on MMLU, and supports Non-Think, Think High, and Think Max reasoning modes. Pricing: $1.74/1M input tokens, $3.48/1M output tokens ($0.145/1M input on cache hits). MIT licensed.
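The listed rates make per-request cost easy to estimate. A minimal sketch, assuming cached input tokens are billed at the cache-hit rate and the remainder at the standard input rate (the split between cached and uncached tokens is an assumption about how caching is billed; the helper name is hypothetical):

```python
# Rates from the card above (USD per 1M tokens).
INPUT_PER_M = 1.74          # input tokens, cache miss
CACHED_INPUT_PER_M = 0.145  # input tokens, cache hit
OUTPUT_PER_M = 3.48         # output tokens

def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request at the listed DeepSeek V4 Pro rates.

    Assumes `cached_tokens` of the input hit the cache and the rest miss
    (billing granularity is an assumption, not stated on the card).
    """
    uncached = input_tokens - cached_tokens
    return (uncached * INPUT_PER_M
            + cached_tokens * CACHED_INPUT_PER_M
            + output_tokens * OUTPUT_PER_M) / 1_000_000

# A full 1M-token context request with 100k cached tokens and 2k output:
cost = estimate_cost(1_000_000, 2_000, cached_tokens=100_000)
# (900_000 * 1.74 + 100_000 * 0.145 + 2_000 * 3.48) / 1e6 ≈ $1.59
```

Note how heavily the cache-hit rate (12x cheaper than a miss) dominates cost at long contexts: the more of the 1M-token window that is reused across requests, the closer input cost falls toward $0.145/1M.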

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution

Specifications

Released: 2026-04-24
Parameters: 1.6T
Context: 1M tokens
Architecture: Mixture of Experts
Specialization: general
License: MIT
Training: pretrained

Created by

DeepSeek

Advancing artificial general intelligence (AGI).

Hangzhou, Zhejiang, China
Founded 2023