LLM Reference

DeepSeek R1 Models by DeepSeek

DeepSeek | Reasoning | Highlight
11 models | 2024–2025 | Up to 160k ctx | From $0.1/1M input

About

DeepSeek R1 is a family of large language models built for advanced reasoning tasks by DeepSeek, a leading Chinese AI firm. The first release in the line, DeepSeek-R1-Lite-Preview, targets logical inference, mathematical reasoning, and real-time problem-solving. It exposes its chain-of-thought reasoning, so users can follow the steps the model takes while solving complex problems. It reportedly performs comparably to OpenAI's o1-preview model on benchmarks such as AIME and MATH. At the time of writing, however, independent verification is pending, as there is no API access or full code release yet. DeepSeek aims to eventually provide an open-source version of the R1 model along with an accessible API. Initial tests show strong capabilities, although the model still stumbles on some logic problems [1][2][3][4][8].
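
For readers who want to work with the visible reasoning programmatically, here is a minimal sketch. It assumes the model returns its chain of thought wrapped in `<think>...</think>` tags ahead of the final answer, as the open-weight R1 releases commonly do. Hosted APIs may return the reasoning in a separate field instead, so treat both the tag format and the helper name `split_reasoning` as assumptions, not a documented interface.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> before the answer.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer) from a raw completion string."""
    match = _THINK_RE.search(text)
    if match is None:
        # No reasoning block found: treat the whole text as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first <think> block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


# Hypothetical completion text, used only to illustrate the split.
sample = "<think>2 + 2 is 4, times 3 is 12.</think>\nThe result is 12."
reasoning, answer = split_reasoning(sample)
```

Keeping the reasoning separate from the answer lets an application log or display the steps without feeding them into downstream parsing.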

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

DeepSeek R1 0528
Use when the workload needs 130k context, reasoning, and structured outputs.
2025-05 | 130k context | reasoning | structured outputs

DeepSeek R1
Use when the workload needs 128k context, reasoning, and structured outputs.
2025-01 | 128k context | reasoning | structured outputs

DeepSeek R1 Zero
Use when the workload needs 128k context and reasoning.
2025-01 | 128k context | reasoning

DeepSeek R1 Distill Qwen-1.5B
Use when the workload needs 128k context, 1.5B parameters, and reasoning.
2025-01 | 128k context | 1.5B parameters | reasoning

DeepSeek R1 Distill Qwen-7B
Use when the workload needs 128k context, 7B parameters, and reasoning.
2025-01 | 128k context | 7B parameters | reasoning

DeepSeek R1 Distill Llama 8B
Use when the workload needs 128k context, 8B parameters, and reasoning.
2025-01 | 128k context | 8B parameters | reasoning

DeepSeek R1 Distill Qwen-14B
Use when the workload needs 128k context, 14B parameters, and reasoning.
2025-01 | 128k context | 14B parameters | reasoning

DeepSeek R1 Distill Qwen-32B
Use when the workload needs 128k context, 32B parameters, and reasoning.
2025-01 | 128k context | 32B parameters | reasoning

DeepSeek R1 Distill Llama 70B
Use when the workload needs 128k context, 70B parameters, and reasoning.
2025-01 | 128k context | 70B parameters | reasoning

DeepSeek R1 Basic
Use when the workload needs 160k context, 671B parameters, and reasoning.
2025-01 | 160k context | 671B parameters | reasoning

DeepSeek R1 Lite
Use when the workload needs 128k context and reasoning.
2024-11 | 128k context | reasoning
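
The use-when lines in this section follow a regular template built from each variant's feature tags. The sketch below reproduces that wording. It is an illustration only: the function name `use_when` is hypothetical, and it assumes the tags are simply joined in order, which may not match how the site actually generates the guidance.

```python
def use_when(features: list[str]) -> str:
    """Build a use-when sentence from an ordered list of feature tags."""
    if not features:
        raise ValueError("at least one feature is required")
    if len(features) == 1:
        needs = features[0]
    elif len(features) == 2:
        # Two tags are joined with a plain "and".
        needs = " and ".join(features)
    else:
        # Three or more tags use a serial comma before the final "and".
        needs = ", ".join(features[:-1]) + ", and " + features[-1]
    return f"Use when the workload needs {needs}."


sentence = use_when(["128k context", "7B parameters", "reasoning"])
```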

Release Timeline

3 release groups

2025-05 (1 current)
DeepSeek R1 0528 | 130k context | reasoning | structured outputs | Current

2025-01 (9 current)
DeepSeek R1 | 128k context | reasoning | structured outputs | Current
DeepSeek R1 Basic | 160k context | 671B parameters | reasoning | Current
DeepSeek R1 Distill Llama 70B | 128k context | 70B parameters | reasoning | Current
DeepSeek R1 Distill Llama 8B | 128k context | 8B parameters | reasoning | Current
DeepSeek R1 Distill Qwen-1.5B | 128k context | 1.5B parameters | reasoning | Current
DeepSeek R1 Distill Qwen-14B | 128k context | 14B parameters | reasoning | Current
DeepSeek R1 Distill Qwen-32B | 128k context | 32B parameters | reasoning | Current
DeepSeek R1 Distill Qwen-7B | 128k context | 7B parameters | reasoning | Current
DeepSeek R1 Zero | 128k context | reasoning | Current

2024-11 (1 current)
DeepSeek R1 Lite | 128k context | reasoning | Current

Specifications (11 models)

DeepSeek R1 model specifications comparison

| Model | Released | Context | Parameters | Reasoning | Structured Outputs | Code Exec |
|---|---|---|---|---|---|---|
| DeepSeek R1 0528 | 2025-05 | 130k | 685B total, 37B active (MoE) | Yes | Yes | Yes |
| DeepSeek R1 | 2025-01 | 128k | 671B, 37B active | Yes | Yes | Yes |
| DeepSeek R1 Zero | 2025-01 | 128k | 671B, 37B active | Yes | No | No |
| DeepSeek R1 Distill Qwen-1.5B | 2025-01 | 128k | 1.5B | Yes | No | No |
| DeepSeek R1 Distill Qwen-7B | 2025-01 | 128k | 7B | Yes | No | No |
| DeepSeek R1 Distill Llama 8B | 2025-01 | 128k | 8B | Yes | No | No |
| DeepSeek R1 Distill Qwen-14B | 2025-01 | 128k | 14B | Yes | No | No |
| DeepSeek R1 Distill Qwen-32B | 2025-01 | 128k | 32B | Yes | Yes | No |
| DeepSeek R1 Distill Llama 70B | 2025-01 | 128k | 70B | Yes | Yes | No |
| DeepSeek R1 Basic | 2025-01 | 160k | 671B | Yes | No | No |
| DeepSeek R1 Lite | 2024-11 | 128k | not listed | Yes | No | No |
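
One way to use the specifications is to filter variants by what a workload requires. The sketch below encodes a subset of the rows from the specifications table and selects the matching names. The `Variant` class and `pick` function are hypothetical helpers written for this illustration, not part of any DeepSeek or provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    context_k: int            # context window in thousands of tokens
    structured_outputs: bool
    code_exec: bool


# Subset of rows copied from the specifications table.
VARIANTS = [
    Variant("DeepSeek R1 0528", 130, True, True),
    Variant("DeepSeek R1", 128, True, True),
    Variant("DeepSeek R1 Zero", 128, False, False),
    Variant("DeepSeek R1 Distill Qwen-32B", 128, True, False),
    Variant("DeepSeek R1 Distill Llama 70B", 128, True, False),
    Variant("DeepSeek R1 Basic", 160, False, False),
]


def pick(min_context_k: int = 0,
         structured_outputs: bool = False,
         code_exec: bool = False) -> list[str]:
    """Return names of variants that satisfy every requested requirement."""
    return [
        v.name
        for v in VARIANTS
        if v.context_k >= min_context_k
        and (v.structured_outputs or not structured_outputs)
        and (v.code_exec or not code_exec)
    ]


needs_json = pick(structured_outputs=True)
needs_long_context = pick(min_context_k=150)
```

A requirement left at its default is ignored, so `pick()` with no arguments returns every encoded variant.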

Available From (17 providers)

Pricing

DeepSeek R1 model pricing by provider

| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| DeepSeek R1 Distill Qwen-1.5B | Fireworks AI | $0.1 | $0.1 | Serverless |
| DeepSeek R1 | Bitdeer AI | $0.1 | $0.3 | Serverless |
| DeepSeek R1 Distill Qwen-14B | Novita AI | $0.15 | $0.15 | Serverless |
| DeepSeek R1 Distill Llama 8B | Fireworks AI | $0.2 | $0.2 | Serverless |
| DeepSeek R1 Distill Qwen-14B | Fireworks AI | $0.2 | $0.2 | Serverless |
| DeepSeek R1 Distill Qwen-7B | Fireworks AI | $0.2 | $0.2 | Serverless |
| DeepSeek R1 | SiliconFlow | $0.25 | $0.8 | Serverless |
| DeepSeek R1 Distill Qwen-32B | OpenRouter | $0.29 | $0.29 | Serverless |
| DeepSeek R1 Distill Qwen-32B | Novita AI | $0.3 | $0.3 | Serverless |
| DeepSeek R1 Distill Llama 70B | Arcee AI | $0.35 | $1.05 | Serverless |
| DeepSeek R1 Distill Qwen-32B | Cloudflare Workers AI | $0.497 | $4.881 | Serverless |
| DeepSeek R1 0528 | OpenRouter | $0.5 | $2.15 | Serverless |
| DeepSeek R1 | DeepSeek Platform | $0.55 | $2.19 | Serverless |
| DeepSeek R1 | Fireworks AI | $0.56 | $1.68 | Serverless |
| DeepSeek R1 0528 | Fireworks AI | $0.56 | $1.68 | Serverless |
| DeepSeek R1 Basic | Fireworks AI | $0.56 | $1.68 | Serverless |
| DeepSeek R1 Distill Llama 70B | DeepInfra | $0.7 | $0.8 | Serverless |
| DeepSeek R1 | OpenRouter | $0.7 | $2.5 | Serverless |
| DeepSeek R1 Distill Llama 70B | OpenRouter | $0.7 | $0.8 | Serverless |
| DeepSeek R1 0528 | Novita AI | $0.7 | $2.5 | Serverless |
| DeepSeek R1 Distill Llama 70B | Novita AI | $0.8 | $0.8 | Serverless |
| DeepSeek R1 Distill Llama 70B | Fireworks AI | $0.9 | $0.9 | Serverless |
| DeepSeek R1 Distill Qwen-32B | Fireworks AI | $0.9 | $0.9 | Serverless |
| DeepSeek R1 | AWS Bedrock | $1.35 | $5.4 | Serverless |
| DeepSeek R1 0528 | GCP Vertex AI | $1.35 | $5.4 | Serverless |
| DeepSeek R1 | GCP Vertex AI | $1.35 | $5.4 | Serverless |
| DeepSeek R1 | Vercel AI Gateway | $1.35 | $5.4 | Serverless |
| DeepSeek R1 | Together AI | $3 | $7 | Serverless |
| DeepSeek R1 0528 | Together AI | $3 | $7 | Serverless |
| DeepSeek R1 | Replicate API | $3.75 | $10 | Serverless |
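
Because input and output tokens are priced separately, the cheapest provider depends on the token mix of the workload. The sketch below estimates cost from the listed per-million prices for a few DeepSeek R1 offers copied from the pricing table. The function names and the token counts are hypothetical, and the sketch ignores anything the table does not list, such as caching discounts or rate limits.

```python
# (input $/1M tokens, output $/1M tokens) for DeepSeek R1,
# copied from a subset of the pricing table rows.
R1_PRICES = {
    "Bitdeer AI": (0.10, 0.30),
    "SiliconFlow": (0.25, 0.80),
    "DeepSeek Platform": (0.55, 2.19),
    "Fireworks AI": (0.56, 1.68),
    "OpenRouter": (0.70, 2.50),
    "Together AI": (3.00, 7.00),
}


def cost_usd(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of a workload at the listed prices."""
    in_price, out_price = R1_PRICES[provider]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest(input_tokens: int, output_tokens: int) -> str:
    """Return the provider with the lowest estimated cost for this mix."""
    return min(R1_PRICES, key=lambda p: cost_usd(p, input_tokens, output_tokens))


# Hypothetical workload: 2M input tokens and 0.5M output tokens.
estimate = cost_usd("DeepSeek Platform", 2_000_000, 500_000)
best = cheapest(2_000_000, 500_000)
```

Sorting by input price alone can mislead for output-heavy reasoning workloads. At one million tokens each way, Fireworks AI comes out cheaper than DeepSeek Platform despite its slightly higher input price, because its output price is lower.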

Frequently Asked Questions

What is DeepSeek R1 used for?
DeepSeek R1 is used for reasoning, structured outputs, and code execution. The family description and listed model capabilities point to those workloads as the best fit.

How does DeepSeek R1 compare to Janus?
DeepSeek R1 is strongest where you need reasoning, while Janus, also by DeepSeek, is the closest related family to check for image generation. DeepSeek R1 has 11 listed variants and reaches up to 160k context, so compare the specs and pricing tables before choosing a production model.

Which DeepSeek R1 model should I use?
For the lowest listed input price, start with DeepSeek R1 through Bitdeer AI at $0.1/1M input tokens. For the most capable and latest choice, evaluate DeepSeek R1 0528, which offers 130k context, reasoning, and structured outputs.
