DeepSeek V2 Models by DeepSeek

6 models, released 2024, up to 128k context, from $0.14 per 1M input tokens

About

The DeepSeek V2 family is a set of large language models (LLMs) noted for balancing strong performance with economical, efficient inference. Its flagship, DeepSeek V2, has 236 billion total parameters, of which 21 billion are activated per token, and supports a context length of 128,000 tokens [15]. Its architecture combines Multi-head Latent Attention (MLA), which compresses the key-value cache, with DeepSeekMoE, which relies on sparse computation [1]. The models are pretrained on 8.1 trillion tokens and then refined through supervised fine-tuning and reinforcement learning [1]. For smaller deployments, DeepSeek V2-Lite is a 16 billion parameter model that can run on a single 40GB GPU [8]. DeepSeek Coder V2 targets programming and supports 338 programming languages [2], while DeepSeek V2.5 combines general and coding abilities and improves benchmark results [3].
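To make the sparse-computation point concrete, the short sketch below works out what share of the flagship model's parameters is active for any single token. It uses only the two figures quoted above; the constant and function names are illustrative, not part of any DeepSeek tooling.

```python
# Figures quoted in the About paragraph (in billions of parameters).
TOTAL_PARAMS_B = 236   # total parameters in DeepSeek V2
ACTIVE_PARAMS_B = 21   # parameters activated per token


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of parameters used for a single token."""
    return active_b / total_b


if __name__ == "__main__":
    frac = active_fraction(TOTAL_PARAMS_B, ACTIVE_PARAMS_B)
    # Roughly 9% of the weights take part in each forward pass.
    print(f"Active per token: {frac:.1%} of all parameters")
```

Under these numbers, per-token compute scales with roughly 21B parameters, while the full 236B still have to be held in memory.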

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Current DeepSeek V2 variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
DeepSeek V2.5	Use when the workload needs 128k context and function calling.	2024-07	128k context, function calling	Current
DeepSeek V2 Chat (0628)	Use when the workload needs 128k context and 236B parameters.	2024-06	128k context, 236B parameters	Current
DeepSeek V2 Lite	Use when the workload needs 32k context and 16B parameters.	2024-05	32k context, 16B parameters	Current
DeepSeek V2 Lite Chat	Use when the workload needs 32k context and 16B parameters.	2024-05	32k context, 16B parameters	Current
DeepSeek V2	Use when the workload needs 128k context, 236B parameters, and structured outputs.	2024-05	128k context, 236B parameters, structured outputs	Current
DeepSeek V2 Chat	Use when the workload needs 128k context and 236B parameters.	2024-05	128k context, 236B parameters	Current

Release Timeline

3 release groups

2024-07 (1 current)
DeepSeek V2.5: 128k context, function calling (Current)

2024-06 (1 current)
DeepSeek V2 Chat (0628): 128k context, 236B parameters (Current)

2024-05 (4 current)
DeepSeek V2: 128k context, 236B parameters, structured outputs (Current)
DeepSeek V2 Chat: 128k context, 236B parameters (Current)
DeepSeek V2 Lite: 32k context, 16B parameters (Current)
DeepSeek V2 Lite Chat: 32k context, 16B parameters (Current)

Specifications (6 models)

DeepSeek V2 model specifications comparison
Model	Released	Context	Parameters	Fn Calling	Structured Outputs
DeepSeek V2.5	2024-07	128k	236B total, 21B active (MoE)	Yes	No
DeepSeek V2 Chat (0628)	2024-06	128k	236B	No	No
DeepSeek V2 Lite	2024-05	32k	16B	No	No
DeepSeek V2 Lite Chat	2024-05	32k	16B	No	No
DeepSeek V2	2024-05	128k	236B	No	Yes
DeepSeek V2 Chat	2024-05	128k	236B	No	No
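As a rough illustration of how the specifications table can drive model selection, the sketch below encodes the table rows and filters them by minimum context and required capabilities. The `ModelSpec` class and `candidates` function are hypothetical helpers written for this example; the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_k: int            # context window, in thousands of tokens
    fn_calling: bool          # "Fn Calling" column
    structured_outputs: bool  # "Structured Outputs" column


# Rows transcribed from the specifications table.
SPECS = [
    ModelSpec("DeepSeek V2.5", 128, True, False),
    ModelSpec("DeepSeek V2 Chat (0628)", 128, False, False),
    ModelSpec("DeepSeek V2 Lite", 32, False, False),
    ModelSpec("DeepSeek V2 Lite Chat", 32, False, False),
    ModelSpec("DeepSeek V2", 128, False, True),
    ModelSpec("DeepSeek V2 Chat", 128, False, False),
]


def candidates(min_context_k=0, need_fn_calling=False, need_structured=False):
    """Return names of models that satisfy every stated requirement."""
    return [
        m.name
        for m in SPECS
        if m.context_k >= min_context_k
        and (m.fn_calling or not need_fn_calling)
        and (m.structured_outputs or not need_structured)
    ]


if __name__ == "__main__":
    print(candidates(min_context_k=128, need_fn_calling=True))
    print(candidates(need_structured=True))
```

Per the table, no single listed variant offers both function calling and structured outputs, so a request for both returns an empty list.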

Available From (2 providers)

DeepSeek Platform

Fireworks AI

Pricing

DeepSeek V2 model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
DeepSeek V2	DeepSeek Platform	$0.14	$0.28	Serverless
DeepSeek V2 Lite Chat	Fireworks AI	$0.20	$0.20	Serverless
DeepSeek V2.5	Fireworks AI	$0.56	$1.68	Serverless
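The per-million-token prices above translate into a request or batch cost with simple arithmetic. The sketch below is a minimal estimator that assumes the listed prices apply uniformly to all tokens; `PRICES` and `estimate_cost` are illustrative names, and the token counts in the example are made up.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    ("DeepSeek V2", "DeepSeek Platform"): (0.14, 0.28),
    ("DeepSeek V2 Lite Chat", "Fireworks AI"): (0.20, 0.20),
    ("DeepSeek V2.5", "Fireworks AI"): (0.56, 1.68),
}


def estimate_cost(model, provider, input_tokens, output_tokens):
    """Estimate the USD cost of a workload from its token counts."""
    in_price, out_price = PRICES[(model, provider)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500k output tokens.
    for (model, provider) in PRICES:
        cost = estimate_cost(model, provider, 2_000_000, 500_000)
        print(f"{model} via {provider}: ${cost:.2f}")
```

For that example workload, the estimate is about $0.42 on DeepSeek V2 via DeepSeek Platform and about $1.96 on DeepSeek V2.5 via Fireworks AI.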

Frequently Asked Questions

What is DeepSeek V2 used for?

DeepSeek V2 is used for agent workflows and tool use, structured outputs, and coding. The family description and listed model capabilities point to those workloads as the best fit.

How does DeepSeek V2 compare to Janus?

DeepSeek V2 by DeepSeek is strongest where you need agent workflows and tool use, while Janus by DeepSeek is the closest related family to check for image generation. DeepSeek V2 has 6 listed variants and reaches up to 128k context, so compare the specs and pricing tables before choosing a production model.

Which DeepSeek V2 model should I use?

For the lowest listed input price, start with DeepSeek V2 through DeepSeek Platform at $0.14/1M input tokens. For the most recent and most capable listed variant, evaluate DeepSeek V2.5, which offers 128k context and function calling.

Models (6)

DeepSeek V2.5: 2024-07, 128k context, 236B total with 21B active (MoE), 1 provider
DeepSeek V2 Chat (0628)
DeepSeek V2 Lite
DeepSeek V2 Lite Chat: 2024-05, 32k context, 16B, 1 provider, Open Source
DeepSeek V2: 2024-05, 128k context, 236B, 1 provider, Open Source
DeepSeek V2 Chat: 2024-05, 128k context, 236B, Open Source
