DeepSeek R1 Models by DeepSeek
About
DeepSeek R1 is a family of large language models built for advanced reasoning tasks by DeepSeek, a leading Chinese AI firm. The first release in the line, DeepSeek-R1-Lite-Preview, targets logical inference, mathematical reasoning, and real-time problem-solving. It exposes its "chain-of-thought" reasoning, so users can follow the steps the model takes when solving complex problems. It reportedly performs comparably to OpenAI's o1-preview model on benchmarks such as AIME and MATH. At the time of writing, however, independent verification was pending, because no API access or full code release was available yet. DeepSeek aims to eventually provide an open-source version of the R1 model along with an accessible API. Initial tests show strong capabilities, although the model occasionally struggles with specific logic problems.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| DeepSeek R1 0528 | Use when the workload needs 130k context, reasoning, and structured outputs. | 2025-05 | 130k context, reasoning, structured outputs | Current |
| DeepSeek R1 | Use when the workload needs 128k context, reasoning, and structured outputs. | 2025-01 | 128k context, reasoning, structured outputs | Current |
| DeepSeek R1 Zero | Use when the workload needs 128k context and reasoning. | 2025-01 | 128k context, reasoning | Current |
| DeepSeek R1 Distill Qwen-1.5B | Use when the workload needs 128k context, 1.5B parameters, and reasoning. | 2025-01 | 128k context, 1.5B parameters, reasoning | Current |
| DeepSeek R1 Distill Qwen-7B | Use when the workload needs 128k context, 7B parameters, and reasoning. | 2025-01 | 128k context, 7B parameters, reasoning | Current |
| DeepSeek R1 Distill Llama 8B | Use when the workload needs 128k context, 8B parameters, and reasoning. | 2025-01 | 128k context, 8B parameters, reasoning | Current |
| DeepSeek R1 Distill Qwen-14B | Use when the workload needs 128k context, 14B parameters, and reasoning. | 2025-01 | 128k context, 14B parameters, reasoning | Current |
| DeepSeek R1 Distill Qwen-32B | Use when the workload needs 128k context, 32B parameters, and reasoning. | 2025-01 | 128k context, 32B parameters, reasoning | Current |
| DeepSeek R1 Distill Llama 70B | Use when the workload needs 128k context, 70B parameters, and reasoning. | 2025-01 | 128k context, 70B parameters, reasoning | Current |
| DeepSeek R1 Basic | Use when the workload needs 160k context, 671B parameters, and reasoning. | 2025-01 | 160k context, 671B parameters, reasoning | Current |
| DeepSeek R1 Lite | Use when the workload needs 128k context and reasoning. | 2024-11 | 128k context, reasoning | Current |
Release Timeline
3 release groups
Specifications (11 models)
| Model | Released | Context | Parameters | Reasoning | Structured Outputs | Code Exec |
|---|---|---|---|---|---|---|
| DeepSeek R1 0528 | 2025-05 | 130k | 685B total, 37B active (MoE) | Yes | Yes | Yes |
| DeepSeek R1 | 2025-01 | 128k | 671B total, 37B active (MoE) | Yes | Yes | Yes |
| DeepSeek R1 Zero | 2025-01 | 128k | 671B total, 37B active (MoE) | Yes | No | No |
| DeepSeek R1 Distill Qwen-1.5B | 2025-01 | 128k | 1.5B | Yes | No | No |
| DeepSeek R1 Distill Qwen-7B | 2025-01 | 128k | 7B | Yes | No | No |
| DeepSeek R1 Distill Llama 8B | 2025-01 | 128k | 8B | Yes | No | No |
| DeepSeek R1 Distill Qwen-14B | 2025-01 | 128k | 14B | Yes | No | No |
| DeepSeek R1 Distill Qwen-32B | 2025-01 | 128k | 32B | Yes | Yes | No |
| DeepSeek R1 Distill Llama 70B | 2025-01 | 128k | 70B | Yes | Yes | No |
| DeepSeek R1 Basic | 2025-01 | 160k | 671B | Yes | No | No |
| DeepSeek R1 Lite | 2024-11 | 128k | — | Yes | No | No |
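As a rough illustration of how the specifications table can drive variant selection, the sketch below encodes three of its columns (context, structured outputs, code execution) and filters on them. The `Variant` class and the `pick` function are hypothetical names introduced here for illustration; they are not part of any DeepSeek or provider tooling, and the values are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    context_k: int            # context window in thousands of tokens
    structured_outputs: bool  # "Structured Outputs" column
    code_exec: bool           # "Code Exec" column


# Rows transcribed from the specifications table, in table order.
VARIANTS = [
    Variant("DeepSeek R1 0528", 130, True, True),
    Variant("DeepSeek R1", 128, True, True),
    Variant("DeepSeek R1 Zero", 128, False, False),
    Variant("DeepSeek R1 Distill Qwen-1.5B", 128, False, False),
    Variant("DeepSeek R1 Distill Qwen-7B", 128, False, False),
    Variant("DeepSeek R1 Distill Llama 8B", 128, False, False),
    Variant("DeepSeek R1 Distill Qwen-14B", 128, False, False),
    Variant("DeepSeek R1 Distill Qwen-32B", 128, True, False),
    Variant("DeepSeek R1 Distill Llama 70B", 128, True, False),
    Variant("DeepSeek R1 Basic", 160, False, False),
    Variant("DeepSeek R1 Lite", 128, False, False),
]


def pick(min_context_k: int = 0,
         need_structured: bool = False,
         need_code_exec: bool = False) -> list[str]:
    """Return the names of variants that meet every stated requirement."""
    return [
        v.name
        for v in VARIANTS
        if v.context_k >= min_context_k
        and (v.structured_outputs or not need_structured)
        and (v.code_exec or not need_code_exec)
    ]


if __name__ == "__main__":
    print(pick(need_structured=True))
    print(pick(min_context_k=160))
```

Every variant in the table supports reasoning, so that column is omitted as a filter. Pricing and provider availability are not modeled here and would still need to be checked separately.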
Available From (17 providers)
Pricing
Comparisons
- GPT-4o (08-06) vs DeepSeek R1
- o3 vs DeepSeek R1
- o1 (12-17) vs DeepSeek R1
- Claude Opus 4.6 vs DeepSeek R1
- Claude 3.7 Sonnet vs DeepSeek R1
- Gemini 2.5 Pro vs DeepSeek R1
- DeepSeek R1 vs Llama 3.3 70B
- DeepSeek R1 vs Grok 4
Frequently Asked Questions
- What is DeepSeek R1 used for?
- DeepSeek R1 is used for reasoning, structured outputs, and code execution. The family description and listed model capabilities point to those workloads as the best fit.
- How does DeepSeek R1 compare to Janus?
- DeepSeek R1 by DeepSeek is strongest where you need reasoning, while Janus by DeepSeek is the closest related family to check for image generation. DeepSeek R1 has 11 listed variants and reaches up to 160k context, so compare the specs and pricing tables before choosing a production model.
- Which DeepSeek R1 model should I use?
- For the lowest listed input price, start with DeepSeek R1 through Bitdeer AI at $0.10 per 1M input tokens. For the latest and most capable variant, evaluate DeepSeek R1 0528, which offers 130k context, reasoning, and structured outputs.
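To make the per-million-token price concrete, here is a minimal sketch of the arithmetic. The function name `input_cost_usd` is hypothetical, and the default rate is the lowest listed input price quoted above. Output-token pricing is not listed in this section, so the estimate covers input tokens only.

```python
def input_cost_usd(input_tokens: int, price_per_million_usd: float = 0.10) -> float:
    """Estimate input-token cost in USD at a given price per 1M tokens."""
    return input_tokens / 1_000_000 * price_per_million_usd


if __name__ == "__main__":
    # 2.5 million input tokens at the quoted rate.
    print(f"${input_cost_usd(2_500_000):.2f}")
```

Reasoning models also emit long chains of thought as output, so input cost alone is likely to understate the total bill.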
Models (11)
DeepSeek R1 0528
DeepSeek R1
DeepSeek R1 Zero
DeepSeek R1 Distill Qwen-1.5B
DeepSeek R1 Distill Qwen-7B
DeepSeek R1 Distill Llama 8B
DeepSeek R1 Distill Qwen-14B
DeepSeek R1 Distill Qwen-32B
DeepSeek R1 Distill Llama 70B
DeepSeek R1 Basic
DeepSeek R1 Lite




