DeepSeek Coder Models by DeepSeek
About
The DeepSeek Coder family is a range of open-source code language models designed for working with large codebases. The models are trained on 2 trillion tokens, of which 87% is code and 13% is English and Chinese natural language, and are available in sizes from 1.3 billion to 33 billion parameters, so users can pick a model that matches their computational resources and needs. Pre-training on a high-quality project-level code corpus with a 16K window size makes the models well suited to code generation and infill tasks. They demonstrate state-of-the-art performance among open-source models on various code benchmarks and outperform some proprietary models. Released under a permissive license, DeepSeek Coder models support both research and commercial applications.
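To make the infill task mentioned above concrete, here is a minimal sketch of how a fill-in-the-middle prompt might be assembled. The three sentinel strings are an assumption based on the public DeepSeek Coder README and should be checked against the tokenizer of the checkpoint in use; `build_infill_prompt` is a hypothetical helper, not part of any library.

```python
# Sentinel strings are an ASSUMPTION taken from the public DeepSeek Coder
# README; confirm them against the tokenizer of the checkpoint you load.
FIM_BEGIN = "<｜fim▁begin｜>"
FIM_HOLE = "<｜fim▁hole｜>"
FIM_END = "<｜fim▁end｜>"


def build_infill_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the gap in fill-in-the-middle sentinels.

    Hypothetical helper: the model is expected to generate the text that
    belongs where the hole sentinel sits.
    """
    return f"{FIM_BEGIN}{prefix}{FIM_HOLE}{suffix}{FIM_END}"


# Ask the model to fill in the body of `add`.
prompt = build_infill_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```

The resulting string would be passed as the raw prompt to a completion endpoint or a local generation call; how that call looks depends on the provider and is not shown here.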
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| DeepSeek Coder 33B | Use when the workload needs code, 16k context, and 33B parameters. | 2024-03 | code, 16k context, 33B parameters | Current |
| DeepSeek Coder 33B Instruct | Use when the workload needs code, 16k context, and 33B parameters. | 2024-03 | code, 16k context, 33B parameters | Current |
| DeepSeek Coder 6.7B Instruct | Use when the workload needs code, 16k context, and 6.7B parameters. | 2024-03 | code, 16k context, 6.7B parameters | Current |
| Together AI Deepseek-Coder-33B-Instruct | Use when the workload needs 16k context, 33B parameters, and structured outputs. | 2024-03 | 16k context, 33B parameters, structured outputs | Current |
| DeepSeek Coder 7B V1.5 | Use when the workload needs code, 16k context, and 7B parameters. | 2024-02 | code, 16k context, 7B parameters | Current |
| DeepSeek Coder 7B V1.5 Instruct | Use when the workload needs code, 16k context, and 7B parameters. | 2024-02 | code, 16k context, 7B parameters | Current |
| DeepSeek Coder 1.3B | Use when the workload needs code, 4k context, and 1.3B parameters. | 2023-11 | code, 4k context, 1.3B parameters | Current |
| DeepSeek Coder 1.3B Instruct | Use when the workload needs code, 16k context, and 1.3B parameters. | 2023-11 | code, 16k context, 1.3B parameters | Current |
Release Timeline
3 release groups
Specifications (9 models)
| Model | Released | Context | Parameters | Structured Outputs |
|---|---|---|---|---|
| DeepSeek Coder 33B | 2024-03 | 16k | 33B | Yes |
| DeepSeek Coder 33B Instruct | 2024-03 | 16k | 33B | No |
| DeepSeek Coder 6.7B Instruct | 2024-03 | 16k | 6.7B | No |
| Together AI Deepseek-Coder-33B-Instruct | 2024-03 | 16k | 33B | Yes |
| DeepSeek Coder 7B V1.5 | 2024-02 | 16k | 7B | No |
| DeepSeek Coder 7B V1.5 Instruct | 2024-02 | 16k | 7B | No |
| DeepSeek Coder 1.3B | 2023-11 | 4k | 1.3B | No |
| DeepSeek Coder 1.3B Instruct | 2023-11 | 16k | 1.3B | No |
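The specifications table can be used to shortlist models programmatically. The sketch below copies the rows above into a small data structure and filters them by context window, parameter count, and structured-output support. `ModelSpec` and `find_models` are hypothetical names introduced for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    released: str
    context_k: int            # context window, in thousands of tokens
    params_b: float           # parameter count, in billions
    structured_outputs: bool


# Rows transcribed from the specifications table above.
SPECS = [
    ModelSpec("DeepSeek Coder 33B", "2024-03", 16, 33, True),
    ModelSpec("DeepSeek Coder 33B Instruct", "2024-03", 16, 33, False),
    ModelSpec("DeepSeek Coder 6.7B Instruct", "2024-03", 16, 6.7, False),
    ModelSpec("Together AI Deepseek-Coder-33B-Instruct", "2024-03", 16, 33, True),
    ModelSpec("DeepSeek Coder 7B V1.5", "2024-02", 16, 7, False),
    ModelSpec("DeepSeek Coder 7B V1.5 Instruct", "2024-02", 16, 7, False),
    ModelSpec("DeepSeek Coder 1.3B", "2023-11", 4, 1.3, False),
    ModelSpec("DeepSeek Coder 1.3B Instruct", "2023-11", 16, 1.3, False),
]


def find_models(min_context_k=0, max_params_b=float("inf"), need_structured=False):
    """Return the names of models that satisfy every requested constraint."""
    return [
        m.name
        for m in SPECS
        if m.context_k >= min_context_k
        and m.params_b <= max_params_b
        and (m.structured_outputs or not need_structured)
    ]


# Models with a 16k window that have at most 7B parameters.
print(find_models(min_context_k=16, max_params_b=7))
# Models that list structured-output support.
print(find_models(need_structured=True))
```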
Available From (4 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| DeepSeek Coder 1.3B | Fireworks AI | $0.1 | $0.1 | Provisioned |
| DeepSeek Coder 7B V1.5 | Fireworks AI | $0.2 | $0.2 | Provisioned |
| Together AI Deepseek-Coder-33B-Instruct | Together AI | $0.3 | $0.3 | Serverless |
| DeepSeek Coder 33B | Together AI | $0.8 | $0.8 | Serverless |
| DeepSeek Coder 33B | Fireworks AI | $0.9 | $0.9 | Provisioned |
| DeepSeek Coder 33B Instruct | Fireworks AI | $0.9 | $0.9 | Serverless |
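As a rough illustration of how the per-1M-token prices translate into a bill, the sketch below estimates the cost of a workload from the pricing rows above. It assumes the listed prices scale linearly with token counts, which may not hold for Provisioned deployments; `estimate_cost` and `cheapest_provider` are hypothetical helpers.

```python
# (model, provider) -> (input USD per 1M tokens, output USD per 1M tokens),
# transcribed from the pricing table above.
PRICES = {
    ("DeepSeek Coder 1.3B", "Fireworks AI"): (0.1, 0.1),
    ("DeepSeek Coder 7B V1.5", "Fireworks AI"): (0.2, 0.2),
    ("Together AI Deepseek-Coder-33B-Instruct", "Together AI"): (0.3, 0.3),
    ("DeepSeek Coder 33B", "Together AI"): (0.8, 0.8),
    ("DeepSeek Coder 33B", "Fireworks AI"): (0.9, 0.9),
    ("DeepSeek Coder 33B Instruct", "Fireworks AI"): (0.9, 0.9),
}


def estimate_cost(model, provider, input_tokens, output_tokens):
    """Estimated USD cost, ASSUMING linear per-token billing."""
    in_price, out_price = PRICES[(model, provider)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest_provider(model):
    """Provider with the lowest combined input + output price for a model."""
    offers = {p: sum(price) for (m, p), price in PRICES.items() if m == model}
    return min(offers, key=offers.get)


# 2M input tokens and 500k output tokens on DeepSeek Coder 33B via Together AI.
cost = estimate_cost("DeepSeek Coder 33B", "Together AI", 2_000_000, 500_000)
print(f"${cost:.2f}")
print(cheapest_provider("DeepSeek Coder 33B"))
```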
Frequently Asked Questions
- What is DeepSeek Coder used for?
- DeepSeek Coder is used for coding and structured outputs. The family description and the listed model capabilities point to those workloads as the best fit.
- How does DeepSeek Coder compare to Janus?
- DeepSeek Coder by DeepSeek is the stronger fit for coding workloads, while Janus by DeepSeek is the closest related family to check for image generation. DeepSeek Coder has 9 listed variants and reaches up to 16k context, so compare the specs and pricing tables before choosing a production model.
- Which DeepSeek Coder model should I use?
- For the lowest listed input price, start with DeepSeek Coder 1.3B through Fireworks AI at $0.1 per 1M input tokens. For the most capable and most recent local choice, evaluate DeepSeek Coder 33B, which offers 16k context and structured outputs.
Models (9)
DeepSeek Coder 33B
DeepSeek Coder 33B Instruct
DeepSeek Coder 6.7B Instruct
Together AI Deepseek-Coder-33B-Instruct
DeepSeek Coder 7B V1.5
DeepSeek Coder 7B V1.5 Instruct
DeepSeek Coder 1.3B
DeepSeek Coder 1.3B Instruct





