LLM Reference

Qwen-Flash

Released: 2025-08-01
Last refreshed: 2026-04-21
Status: Researched 46d ago
Open source · Commercial use allowed · Long context

Qwen-Flash is worth evaluating for long-context workloads when its single tracked provider route (Alibaba Cloud PAI-EAS) and its 1M-token context window match the workload.

Use it for

  • Teams evaluating long-context models
  • Workloads that can use a 1M-token context window
  • Buyers assessing the 1 tracked provider route

Do not use it for

  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows

Specifications

Family: Qwen3
Released: 2025-08-01
Context: 1M tokens
Openness: Open source
License: Apache 2.0 (OSI) · Commercial use allowed
Created by

AI research institute of Alibaba Group.

Hangzhou, Zhejiang, China
Founded 2017

Pricing

Output / 1M: $2.00
Input / 1M: $0.250

Cheapest of 1 route · Alibaba Cloud PAI-EAS

About

Qwen-Flash is a Flash model in the Qwen3 series that integrates thinking and non-thinking modes, which can be switched mid-dialogue. It is positioned for complex reasoning tasks, with significant improvements in instruction adherence and text understanding. It supports a 1M-token context length, with tiered pricing based on context length.

Qwen-Flash is an open-source model in the Qwen3 family. The structured metadata tracks a 1M-token context window. This page tracks 1 provider route, Alibaba Cloud PAI-EAS, listed at $0.25 input and $2.00 output per 1M tokens. No headline benchmark score is tracked for Qwen-Flash yet.
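
As a rough illustration of what the tracked rates mean per request, the sketch below applies the listed $0.25 input and $2.00 output prices per 1M tokens as a flat rate. The function name `estimate_cost_usd` and the sample token counts are hypothetical, and the context-length pricing tiers mentioned above are not modeled because this page does not list them, so treat the result as an estimate rather than a billing figure.

```python
from decimal import Decimal

# Tracked route prices from this page (USD per 1M tokens, Alibaba Cloud PAI-EAS).
INPUT_USD_PER_1M = Decimal("0.25")
OUTPUT_USD_PER_1M = Decimal("2.00")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Flat-rate estimate (hypothetical helper).

    Assumption: a single flat rate applies to the whole request.
    Context-length pricing tiers are not modeled here.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_1M / TOKENS_PER_UNIT
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_1M / TOKENS_PER_UNIT
    return input_cost + output_cost


# Hypothetical request: 200,000 input tokens and 4,000 output tokens.
print(estimate_cost_usd(200_000, 4_000))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many calls.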

Top use-case fit

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 1 provider for input and output tokens, plus batch and cached reads when available.

Provider                 Input / 1M   Output / 1M   Route
Alibaba Cloud PAI-EAS    $0.250       $2.00         Serverless

Capabilities

No model capability flags are currently sourced.

Benchmark peer bars for Long context

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.