LLM Reference

DeepSeek Models by DeepSeek

DeepSeek · DeepSeek License · Open weights
8 models · 2023–2025 · Up to 164k context · From $0.21/1M input tokens

Details

Researcher: DeepSeek
Commercial use: Allowed
Models: 8
Released: 2023–2025
Max context: 164k

Capabilities

Structured Outputs: 5 of 8 models
Code Execution: 1 of 8 models

About

The DeepSeek LLM family consists of open-source large language models built for general language understanding across a range of applications [4][10]. The models are strongest in reasoning, coding, mathematics, and Chinese comprehension, and often outperform similarly sized models on benchmarks [4][10]. The lineup includes base and chat models at two sizes, 7 billion and 67 billion parameters [4][10]. They are trained on a dataset of 2 trillion tokens in English and Chinese [4][10]. The architecture follows the Llama design, and the 67B model uses Grouped-Query Attention to improve inference efficiency [1]. The models are available for research and commercial use. Related families such as DeepSeek-Coder and DeepSeek-VL target code generation and vision-language tasks, respectively [8][9].
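To make the inference-efficiency claim for Grouped-Query Attention concrete, the sketch below compares the key/value cache size of standard multi-head attention with that of a grouped variant. The layer count, head counts, head dimension, and 16-bit cache precision are illustrative assumptions and do not come from this page; only the 4k sequence length matches the listed context.

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, bytes_per_value=2):
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: each layer stores one key tensor and one value tensor.
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value


# Illustrative hyperparameters (assumptions, not taken from this page).
LAYERS = 95
QUERY_HEADS = 64   # multi-head attention keeps one K/V head per query head
KV_GROUPS = 8      # grouped-query attention shares K/V heads across queries
HEAD_DIM = 128
SEQ_LEN = 4096     # the 4k context listed for the 67B models

mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, SEQ_LEN)
gqa = kv_cache_bytes(LAYERS, KV_GROUPS, HEAD_DIM, SEQ_LEN)

print(f"MHA cache: {mha / 2**30:.2f} GiB")
print(f"GQA cache: {gqa / 2**30:.2f} GiB")
print(f"Reduction: {mha // gqa}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key/value groups, which is where the memory and bandwidth savings at inference time come from.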

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

DeepSeek V3.1 Terminus (2025-09): use when the workload needs 164k context and structured outputs.

DeepSeek V3.2 Speciale (2025-04): use when the workload needs 164k context and structured outputs.

DeepSeek V3.2 Exp (2025-04): use when the workload needs 164k context, structured outputs, and code execution.

Together AI Deepseek-LLM-67B-Chat (2024-01): use when a 4k context is enough and the workload needs a 67B model with structured outputs.

DeepSeek 67B Chat (2023-11): use when a 4k context is enough and the workload needs a 67B model with structured outputs.

DeepSeek 7B Chat (2023-11): use when a 4k context is enough and a 7B model fits the workload.

DeepSeek 67B (2023-11): use when a 4k context is enough and the workload needs a 67B model.

DeepSeek 7B (2023-11): use when a 4k context is enough and a 7B model fits the workload.

Release Timeline

4 release groups

2025-09 (1 current)
DeepSeek V3.1 Terminus: 164k context, structured outputs (Current)

2025-04 (2 current)
DeepSeek V3.2 Exp: 164k context, structured outputs, code execution (Current)
DeepSeek V3.2 Speciale: 164k context, structured outputs (Current)

2024-01 (1 current)
Together AI Deepseek-LLM-67B-Chat: 4k context, 67B parameters, structured outputs (Current)

2023-11 (4 current)
DeepSeek 67B: 4k context, 67B parameters (Current)
DeepSeek 67B Chat: 4k context, 67B parameters, structured outputs (Current)
DeepSeek 7B: 4k context, 7B parameters (Current)
DeepSeek 7B Chat: 4k context, 7B parameters (Current)

Specifications (8 models)

DeepSeek model specifications comparison

| Model | Released | Context | Parameters | Structured Outputs | Code Exec |
|---|---|---|---|---|---|
| DeepSeek V3.1 Terminus | 2025-09 | 164k | 671B total, 37B active (MoE) | Yes | No |
| DeepSeek V3.2 Speciale | 2025-04 | 164k | 685B total, 37B active (MoE) | Yes | No |
| DeepSeek V3.2 Exp | 2025-04 | 164k | 685B total, 37B active (MoE) | Yes | Yes |
| Together AI Deepseek-LLM-67B-Chat | 2024-01 | 4k | 67B | Yes | No |
| DeepSeek 67B Chat | 2023-11 | 4k | 67B | Yes | No |
| DeepSeek 7B Chat | 2023-11 | 4k | 7B | No | No |
| DeepSeek 67B | 2023-11 | 4k | 67B | No | No |
| DeepSeek 7B | 2023-11 | 4k | 7B | No | No |
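The specification rows can be treated as data and filtered by requirement. The following is a minimal sketch, not an official tool: the `ModelSpec` class and `shortlist` function are hypothetical names introduced for illustration, and the values are copied from the specifications listed on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    released: str
    context_k: int            # context window in thousands of tokens
    structured_outputs: bool
    code_execution: bool


# Rows copied from the specifications listed above.
SPECS = [
    ModelSpec("DeepSeek V3.1 Terminus", "2025-09", 164, True, False),
    ModelSpec("DeepSeek V3.2 Speciale", "2025-04", 164, True, False),
    ModelSpec("DeepSeek V3.2 Exp", "2025-04", 164, True, True),
    ModelSpec("Together AI Deepseek-LLM-67B-Chat", "2024-01", 4, True, False),
    ModelSpec("DeepSeek 67B Chat", "2023-11", 4, True, False),
    ModelSpec("DeepSeek 7B Chat", "2023-11", 4, False, False),
    ModelSpec("DeepSeek 67B", "2023-11", 4, False, False),
    ModelSpec("DeepSeek 7B", "2023-11", 4, False, False),
]


def shortlist(min_context_k=0, structured_outputs=False, code_execution=False):
    """Return names of models that meet every requested requirement."""
    return [
        m.name
        for m in SPECS
        if m.context_k >= min_context_k
        and (m.structured_outputs or not structured_outputs)
        and (m.code_execution or not code_execution)
    ]


print(shortlist(min_context_k=100, structured_outputs=True))
print(shortlist(code_execution=True))
```

A requirement left at its default is simply not enforced, so the same function covers both "any model with long context" and "only models with code execution".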

Pricing

DeepSeek model pricing by provider

| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| DeepSeek V3.1 Terminus | OpenRouter | $0.21 | $0.79 | Serverless |
| DeepSeek V3.2 Exp | OpenRouter | $0.27 | $0.41 | Serverless |
| DeepSeek V3.1 Terminus | Vercel AI Gateway | $0.27 | $1.00 | Serverless |
| DeepSeek V3.2 Exp | Novita AI | $0.27 | $0.41 | Serverless |
| DeepSeek V3.1 Terminus | Novita AI | $0.27 | $1.00 | Serverless |
| DeepSeek V3.2 Speciale | DeepSeek Platform | $0.28 | $0.42 | Serverless |
| DeepSeek V3.2 Exp | DeepSeek Platform | $0.28 | $0.42 | Serverless |
| DeepSeek V3.2 Speciale | OpenRouter | $0.40 | $1.20 | Serverless |
| Together AI Deepseek-LLM-67B-Chat | Together AI | $0.60 | $0.60 | Serverless |
| DeepSeek 67B Chat | Together AI | $0.90 | $0.90 | Serverless |
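Because input and output tokens are priced separately, the cheapest offer depends on the token mix of the workload. The sketch below estimates cost from the per-1M prices listed above; `request_cost` and `rank_offers` are hypothetical helper names, and the 2M-input / 0.5M-output workload is an arbitrary example rather than a measured figure.

```python
# (model, provider, input $ per 1M tokens, output $ per 1M tokens),
# copied from the pricing listed above.
PRICES = [
    ("DeepSeek V3.1 Terminus", "OpenRouter", 0.21, 0.79),
    ("DeepSeek V3.2 Exp", "OpenRouter", 0.27, 0.41),
    ("DeepSeek V3.1 Terminus", "Vercel AI Gateway", 0.27, 1.00),
    ("DeepSeek V3.2 Exp", "Novita AI", 0.27, 0.41),
    ("DeepSeek V3.1 Terminus", "Novita AI", 0.27, 1.00),
    ("DeepSeek V3.2 Speciale", "DeepSeek Platform", 0.28, 0.42),
    ("DeepSeek V3.2 Exp", "DeepSeek Platform", 0.28, 0.42),
    ("DeepSeek V3.2 Speciale", "OpenRouter", 0.40, 1.20),
    ("Together AI Deepseek-LLM-67B-Chat", "Together AI", 0.60, 0.60),
    ("DeepSeek 67B Chat", "Together AI", 0.90, 0.90),
]


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars, given prices quoted per one million tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_offers(input_tokens, output_tokens):
    """Return (model, provider, cost) tuples sorted from cheapest to priciest."""
    offers = [
        (model, provider, request_cost(input_tokens, output_tokens, p_in, p_out))
        for model, provider, p_in, p_out in PRICES
    ]
    return sorted(offers, key=lambda offer: offer[2])


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
ranking = rank_offers(2_000_000, 500_000)
for model, provider, cost in ranking[:3]:
    print(f"{model} via {provider}: ${cost:.3f}")
```

For this mix, the offer with the lowest input price is not the cheapest overall, because its higher output price outweighs the input savings. Running the ranking with your own token counts is a more reliable guide than comparing input prices alone.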

Frequently Asked Questions

What is DeepSeek used for?
DeepSeek models are used mainly for coding, reasoning, and workloads that need structured outputs or code execution. The family description and the listed model capabilities point to those workloads as the best fit.

How does DeepSeek compare to Janus?
DeepSeek is strongest where you need structured outputs, while Janus, also by DeepSeek, is the closest related family to check for image generation. DeepSeek has 8 listed variants and reaches up to 164k context, so compare the specifications and pricing tables before choosing a production model.

Which DeepSeek model should I use?
For the lowest listed input price, start with DeepSeek V3.1 Terminus through OpenRouter at $0.21/1M input tokens. For the broadest listed capability set, evaluate DeepSeek V3.2 Exp, which offers 164k context and structured outputs and is the only listed variant with code execution.