DeepSeek R1 Distill Qwen-7B
DeepSeek R1 Distill Qwen-7B is worth evaluating for long-context workloads when a tracked provider route and the 128k context window match your requirements.
Use it for
- Teams evaluating long context
- Workloads that can use a 128k context window
- Buyers comparing 2 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
- Family: DeepSeek R1
- Released: 2025-01-20
- Context: 128k
- Parameters: 7B
- Architecture: Decoder-only
- Specialization: general
- Training: multistage
- Fine-tuning: task_specific
Cheapest of 2 routes · Fireworks AI
About
DeepSeek R1 Distill Qwen-7B is a 7B-parameter model in DeepSeek's R1 family with an optional reasoning mode. It offers a 128k-token context window, and its weights are openly available for self-hosting.
The structured metadata lists a 128k-token context window and reasoning support. This page tracks provider routes through Fireworks AI and NVIDIA NIM; the cheapest tracked route is listed at $0.20 input and $0.20 output per 1M tokens. No headline benchmark score is tracked for DeepSeek R1 Distill Qwen-7B yet.
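As a rough illustration of what matching the context window to a workload means, the sketch below checks whether a prompt plus a reserved output budget fits in the listed window. It assumes "128k" means 128,000 tokens; the exact limit may differ by provider route, and the function name `fits_context` is hypothetical, not part of any provider API.

```python
# Assumption: "128k" on this page is taken as 128,000 tokens.
# The real limit can differ per provider route; check the route's docs.
CONTEXT_WINDOW_TOKENS = 128_000


def fits_context(prompt_tokens: int, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # Input and generated tokens share the same window.
    return prompt_tokens + max_output_tokens <= window


if __name__ == "__main__":
    print(fits_context(100_000, 8_000))
    print(fits_context(125_000, 8_000))
```

Token counts should come from the tokenizer the provider actually uses; a character-based estimate is only a first approximation.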
Top use-case fit
Long context
Included by capability and metadata signals in the decision map.
Provider price ladder
Compare API pricing across 2 providers for input and output tokens, plus batch and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Fireworks AI | $0.200 | $0.200 | Serverless |
| NVIDIA NIM | - | - | Serverless (partial) |
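To turn the per-1M-token prices above into a per-request figure, a minimal sketch follows. It uses only the Fireworks AI serverless prices listed in the table ($0.200 input, $0.200 output per 1M tokens); the function name `estimate_cost_usd` and the example token counts are hypothetical, and batch or cached-read discounts are not modeled.

```python
from decimal import Decimal

# Prices from the table above (Fireworks AI, serverless), in USD per 1M tokens.
INPUT_USD_PER_1M = Decimal("0.20")
OUTPUT_USD_PER_1M = Decimal("0.20")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD, without batch or cache discounts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (Decimal(input_tokens) * INPUT_USD_PER_1M
             + Decimal(output_tokens) * OUTPUT_USD_PER_1M)
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Hypothetical request: 50,000 input tokens and 2,000 output tokens.
    print(estimate_cost_usd(50_000, 2_000))
```

`Decimal` is used instead of floats so that small per-token amounts add up without binary rounding error.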
Capabilities
Benchmark peer bars for Long context
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.