LLM Reference

DeepSeek R1 0528 Distill Qwen3-8B

Released: 2025-01-01
Last refreshed: 2026-05-19
Status: Researched 44d ago
Open source · Commercial use: permitted · Long context

DeepSeek R1 0528 Distill Qwen3-8B is worth evaluating for long-context workloads that fit its 160k-token context window and can use its single tracked provider route, Fireworks AI.

Use it for

  • Teams evaluating long context
  • Workloads that can use a 160k context window
  • Buyers comparing 1 tracked provider route

Do not use it for

  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows
Specifications
Family: Qwen3
Released: 2025-01-01
Context: 160k
Parameters: 8B
Architecture: Decoder-only
Specialization: general
Openness: Open source
License: Apache 2.0 · OSI-approved · Commercial use: permitted
Training: Pretrained
Created by

AI research institute of Alibaba Group.

Hangzhou, Zhejiang, China
Founded 2017
Pricing
Output / 1M: $0.200
Input / 1M: $0.200

Cheapest of 1 route · Fireworks AI
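
To turn the listed per-token prices into a per-request figure, the arithmetic can be sketched as below. This is a minimal illustration, not an official calculator: the function name `estimate_cost_usd` is hypothetical, and the prices are simply the $0.200 input and $0.200 output per 1M tokens shown on this page, which may change.

```python
from decimal import Decimal

# Prices copied from this page's Fireworks AI route, in USD per 1M tokens.
# They are assumptions for illustration and may be out of date.
INPUT_PRICE_PER_M = Decimal("0.200")
OUTPUT_PRICE_PER_M = Decimal("0.200")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request in USD (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


# Example: a 120,000-token prompt with a 4,000-token response.
print(estimate_cost_usd(120_000, 4_000))
```

Because input and output are priced the same on this route, the estimate depends only on the total token count. Reasoning output, if billed as generated tokens, would count toward `output_tokens`.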

About

DeepSeek R1 0528 Distill Qwen3-8B is a reasoning model built on Alibaba's Qwen3-8B and fine-tuned by DeepSeek on outputs distilled from DeepSeek-R1-0528. It offers a 160k-token context window, and its weights are openly available for self-hosting.

DeepSeek R1 0528 Distill Qwen3-8B is an open-source model in the Qwen3 family. Its structured metadata lists a 160k-token context window and reasoning capability. This page tracks 1 provider route, Fireworks AI, priced at $0.200 input and $0.200 output per 1M tokens. No headline benchmark score is tracked for DeepSeek R1 0528 Distill Qwen3-8B yet.

Top use-case fit

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 1 provider for input and output tokens, plus batch and cached reads when available.

Provider       Input / 1M   Output / 1M   Route
Fireworks AI   $0.200       $0.200        Serverless

Available via routers & gateways (1)

Capabilities

Reasoning

Benchmark peer bars for Long context

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

Frequently asked questions

What is the context window of DeepSeek R1 0528 Distill Qwen3-8B?

DeepSeek R1 0528 Distill Qwen3-8B has a context window of 160k tokens.
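
A quick way to check whether a request fits that window is sketched below. It assumes "160k" means 160,000 tokens (the page does not give an exact figure) and that the prompt and the generated output, including any reasoning tokens, share the same window; the names `CONTEXT_WINDOW`, `fits_context`, and `max_output_budget` are hypothetical.

```python
# Assumption: "160k" is read as 160,000 tokens; the exact limit on a given
# provider route may differ.
CONTEXT_WINDOW = 160_000


def max_output_budget(prompt_tokens: int, window: int = CONTEXT_WINDOW) -> int:
    """Tokens left for generated output after the prompt (never negative)."""
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(0, window - prompt_tokens)


def fits_context(
    prompt_tokens: int, max_output_tokens: int, window: int = CONTEXT_WINDOW
) -> bool:
    """Return True if prompt plus requested output stays within the window."""
    if max_output_tokens < 0:
        raise ValueError("max_output_tokens must be non-negative")
    return max_output_tokens <= max_output_budget(prompt_tokens, window)


# Example: a 150,000-token prompt leaves room for 10,000 output tokens.
print(max_output_budget(150_000))
print(fits_context(150_000, 8_000))
print(fits_context(155_000, 8_000))
```

Token counts depend on the tokenizer, so counts measured with a different model's tokenizer are only approximate here.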

How much does DeepSeek R1 0528 Distill Qwen3-8B cost?

DeepSeek R1 0528 Distill Qwen3-8B costs $0.200 per 1M input tokens and $0.200 per 1M output tokens through Fireworks AI, the only tracked route.

When was DeepSeek R1 0528 Distill Qwen3-8B released?

DeepSeek R1 0528 Distill Qwen3-8B was released on 2025-01-01.

Which providers offer DeepSeek R1 0528 Distill Qwen3-8B?

DeepSeek R1 0528 Distill Qwen3-8B is available from 1 provider: Fireworks AI.