DeepSeek R1 Distill Qwen-32B

Name: DeepSeek R1 Distill Qwen-32B
Author: DeepSeek

Released

2025-01-20

Last refreshed

2026-06-29

Status

Researched 47d ago

Open source · Commercial use: permitted · RAG · Long context · Classification · JSON / Tool use

DeepSeek R1 Distill Qwen-32B is worth evaluating for RAG, long-context, and classification workloads when its provider route and context window match the workload.

Use it for

Teams evaluating RAG, long-context, and classification workloads
Workloads that can use a 128k context window
Buyers comparing 5 tracked provider routes (4 with listed pricing)

Do not use it for

Vision or document-understanding workloads

Specifications

Family: DeepSeek R1
Released: 2025-01-20
Context: 128k
Parameters: 32B
Architecture: Decoder-only
Specialization: general
Openness: Open source
License: MIT · OSI-approved · Commercial use: permitted
Weights: Unknown
Code: Unknown
Training: Multi-stage

Created by

DeepSeek

Advancing artificial general intelligence (AGI).

Hangzhou, Zhejiang, China

Founded 2023

Pricing

Output / 1M

$0.290

Input / 1M

$0.290

Cheapest of 5 routes · OpenRouter

Providers (5)

Cloudflare Workers AI, OpenRouter, Fireworks AI, NVIDIA NIM, Novita AI

About

DeepSeek R1 Distill Qwen-32B is a distilled model in DeepSeek's R1 family that reasons step by step before answering. It offers a 128K-token context window, and its weights are openly available for self-hosting.

DeepSeek R1 Distill Qwen-32B is a reasoning-specialized language model released by DeepSeek on January 20, 2025 under the MIT license. The model has approximately 32.8 billion parameters and is produced by knowledge distillation: reasoning traces and outputs from DeepSeek R1 are used to fine-tune a Qwen2.5-32B base model, transferring R1's chain-of-thought reasoning patterns into a compact dense model without running full R1 inference at serving time. The context window is 128,000 tokens.

The distillation approach produces a model that generates extended reasoning before arriving at a final answer, similar in behavior to OpenAI's o1-series models. On mathematical benchmarks at release, it achieved approximately 94.3% accuracy on MATH-500 and outperformed OpenAI's o1-mini on several reasoning-focused evaluations. The MIT license permits commercial and research use with minimal conditions, making it one of the most permissively licensed high-quality reasoning models available.
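As a minimal sketch of handling that extended reasoning downstream, the snippet below separates the reasoning trace from the final answer. It assumes the model wraps its reasoning in `<think>...</think>` tags, a convention of R1-family releases that this page does not state; `split_reasoning` is a hypothetical helper name.

```python
import re

# Assumption: reasoning is wrapped in <think>...</think> ahead of the answer.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, answer); reasoning is "" when no think block is found."""
    match = THINK_BLOCK.search(text)
    if match is None:
        return "", text.strip()
    reasoning = match.group(1).strip()
    # Everything outside the first think block is treated as the answer.
    answer = (text[:match.start()] + text[match.end():]).strip()
    return reasoning, answer


if __name__ == "__main__":
    sample = "<think>2 + 2 is 4, doubled is 8.</think>\nThe answer is 8."
    reasoning, answer = split_reasoning(sample)
    print("reasoning:", reasoning)
    print("answer:", answer)
```

Keeping the trace separate lets a pipeline log or discard the reasoning while passing only the final answer to later stages.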

DeepSeek R1 Distill Qwen-32B is available on OpenRouter, Fireworks AI, NVIDIA NIM, Novita AI, and Groq, and can be self-hosted from Hugging Face (deepseek-ai/DeepSeek-R1-Distill-Qwen-32B). For organizations needing strong open reasoning capability at 32B scale, it is a primary choice alongside QwQ-32B. The model does not include tool use or code execution capabilities natively, but integrates well into structured reasoning pipelines.
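For self-hosting, a rough sizing estimate follows from the parameter count. The sketch below multiplies the approximately 32.8 billion parameters by bytes per parameter at a few common weight formats. The format table and the function name are illustrative assumptions, and the figures cover weights only, not KV cache or activations.

```python
PARAMS = 32.8e9  # approximate parameter count cited in this listing

# Assumed bytes per parameter for common weight formats (illustrative only).
BYTES_PER_PARAM = {"fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Weights-only footprint in decimal gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: ~{weight_memory_gb(PARAMS, size):.1f} GB")
```

Actual requirements will be higher once the KV cache for long contexts and runtime overhead are included.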

DeepSeek R1 Distill Qwen-32B has a 128k-token context window.
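Whether a workload fits can be checked with a simple token budget. This sketch assumes the 128k window means 128,000 tokens shared between the prompt and the generated output, and it uses a crude 4-characters-per-token estimate as a stand-in for the model's real tokenizer; both helper names are hypothetical.

```python
CONTEXT_WINDOW = 128_000  # tokens, per this listing
CHARS_PER_TOKEN = 4       # rough heuristic, NOT the model's tokenizer


def estimate_tokens(text: str) -> int:
    """Crude token estimate; swap in a real tokenizer for accuracy."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def fits_context(prompt: str, max_output_tokens: int) -> bool:
    """True if the estimated prompt plus the reserved output fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= CONTEXT_WINDOW


if __name__ == "__main__":
    document = "x" * 400_000  # stands in for a long retrieved context
    print(fits_context(document, max_output_tokens=8_000))
```

Because the model writes its reasoning before the answer, it is worth reserving a generous output budget rather than filling the window with prompt text.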

DeepSeek R1 Distill Qwen-32B costs $0.29 per 1M input tokens and $0.29 per 1M output tokens on its cheapest tracked route (OpenRouter).

Top use-case fit

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Classification

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across the 4 providers with listed prices for input and output tokens, plus batch and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
OpenRouter	$0.290	$0.290	Serverless
Novita AI	$0.300	$0.300	Serverless
Fireworks AI	$0.900	$0.900	Serverless
Cloudflare Workers AI	$0.497	$4.88	Serverless
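The per-token prices in the table can be turned into a per-request cost comparison. The sketch below copies the four listed price pairs and ranks providers for a given token mix; the function names and the example token counts are illustrative assumptions, and real bills may differ with provider-specific rounding or fees.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
PRICES = {
    "OpenRouter": (0.290, 0.290),
    "Novita AI": (0.300, 0.300),
    "Fireworks AI": (0.900, 0.900),
    "Cloudflare Workers AI": (0.497, 4.88),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    price_in, price_out = PRICES[provider]
    return (input_tokens * price_in + output_tokens * price_out) / 1_000_000


def rank_providers(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Providers sorted from cheapest to most expensive for this token mix."""
    costs = {p: request_cost(p, input_tokens, output_tokens) for p in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical workload: 10k prompt tokens, 2k generated tokens.
    for provider, cost in rank_providers(10_000, 2_000):
        print(f"{provider}: ${cost:.5f}")
```

The ranking depends on the input/output mix: with asymmetric pricing, a route that looks cheap for prompt-heavy requests can become the most expensive once long reasoning outputs dominate.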

Available via routers & gateways (2)

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

Passthrough · Fireworks AI

NVIDIA LLM Router Blueprint

Router

NVIDIA's open-source AI blueprint for LLM routing that selects the optimal model per prompt via intent classification or neural auto-routing; being deprecated 2026-06-20.

Free OSS · NVIDIA NIM