LLM ReferenceLLM Reference

DeepSeek V4 Flash

Open Source

About

DeepSeek V4 Flash is a 284B parameter (13B activated) Mixture-of-Experts language model with 1M-token context. Features a hybrid attention architecture combining Compressed Sparse Attention (CSA) and Heavily Compressed Attention (HCA) for efficient long-context inference. Supports thinking and non-thinking modes. Legacy API aliases deepseek-chat and deepseek-reasoner map to this model's non-thinking and thinking modes respectively. Pricing: $0.14/1M input, $0.28/1M output (cache hit: $0.028/1M input). MIT licensed.

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsCode Execution

Rankings

Specifications

Released2026-04-24
Parameters284B
Context1M
ArchitectureMixture of Experts
Specializationgeneral
LicenseMIT
Trainingpretrained

Created by

Advancing artificial general intelligence (AGI).

Hangzhou, Zhejiang, China
Founded 2023
Website