LLM Reference

DeepSeek R1 Distill Qwen-32B

Released
2025-01-20
Last refreshed
2026-06-01
Status
Researched 3d ago
Open SourceRAGLong contextClassificationJSON / Tool use

DeepSeek R1 Distill Qwen-32B is worth evaluating for rag, long context, and classification when its provider route and context window match the workload.

Use it for

  • Teams evaluating rag, long context, and classification
  • Workloads that can use a 128k context window
  • Buyers comparing 4 tracked provider routes

Do not use it for

  • Vision or document-understanding workloads
Specifications
Released
2025-01-20
Context
128k
Parameters
32B
Architecture
Decoder Only
Specialization
general
Training
multistage
Fine-tuning
task_specific
Created by

Advancing artificial general intelligence (AGI).

Hangzhou, Zhejiang, China
Founded 2023
Website
Pricing
Output / 1M
$0.290
Input / 1M
$0.290

Cheapest of 5 routes · OpenRouter

About

DeepSeek R1 Distill Qwen-32B is DeepSeek's DeepSeek R1 model with an optional reasoning mode. It offers a 128K-token context window with weights openly available for self-hosting.

DeepSeek R1 Distill Qwen-32B is a reasoning-specialized language model released by DeepSeek on January 20, 2025 under the MIT license. The model has approximately 32.8 billion parameters and is produced by knowledge distillation: reasoning traces and outputs from DeepSeek R1 are used to fine-tune a Qwen2.5-32B base model, transferring R1's chain-of-thought reasoning patterns into a compact dense model without running full R1 inference at serving time. The context window is 128,000 tokens.

The distillation approach produces a model that generates extended reasoning before arriving at a final answer, similar in behavior to OpenAI's o1-style models. On mathematical benchmarks at release, it achieved approximately 94.3% accuracy on MATH and outperformed OpenAI's o1-mini on several reasoning-focused evaluations. The MIT license allows unconstrained commercial and research use, making it one of the most permissively licensed high-quality reasoning models available.

DeepSeek R1 Distill Qwen-32B is available on OpenRouter, Fireworks AI, NVIDIA NIM, Novita AI, and Groq, and can be self-hosted from Hugging Face (deepseek-ai/DeepSeek-R1-Distill-Qwen-32B). For organizations needing strong open reasoning capability at 32B scale, it is a primary choice alongside QwQ-32B. The model does not include tool use or code execution capabilities natively, but integrates well into structured reasoning pipelines.

DeepSeek R1 Distill Qwen-32B has a 128k-token context window.

DeepSeek R1 Distill Qwen-32B input tokens at $0.29/1M, output at $0.29/1M.

Top use-case fit

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Classification

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 5

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

ProviderInput / 1MOutput / 1MRoute
OpenRouter$0.290$0.290
Serverless
Novita AI$0.300$0.300
Serverless
Fireworks AI$0.900$0.900
Serverless
Cloudflare Workers AI$0.497$4.88
Serverless

Capabilities

ReasoningStructured Outputs

Benchmark peer barsfor RAG

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(10)