Qwen2-7B

Name: Qwen2-7B
Author: Alibaba

Released

2024-06-05

Last refreshed

2026-07-11

Status

Researched 69d ago

Open source · Commercial use: permitted · Coding · RAG · Long context · Classification · JSON / Tool use

Qwen2-7B is worth evaluating for coding, RAG, and long-context workloads when its provider route and context window match the workload.

Use it for

Teams evaluating coding, RAG, and long-context workloads
Workloads that can use a 128k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Qwen2
Released: 2024-06-05
Context: 128k
Parameters: 7.07B
Architecture: Decoder Only
Specialization: general
Openness: Open source
License: Apache 2.0 · OSI-approved · Commercial use: permitted
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

Alibaba

AI research institute of Alibaba Group.

Hangzhou, Zhejiang, China

Founded 2017

Website

Pricing

Output / 1M

$0.150

Input / 1M

$0.050

Cheapest of 5 routes · DeepInfra

Providers(5)

DeepInfra · OctoAI API (Deprecated) · Microsoft Foundry · Fireworks AI · NVIDIA NIM


About

Qwen2-7B is Alibaba's Qwen2 model. It offers a 128K-token context window and scores 55.4 on GPQA.

Qwen2-7B is an open-source model in the Qwen2 family. The structured metadata tracks a 128k-token context window and structured outputs. This page tracks provider routes through DeepInfra, OctoAI API (Deprecated), Microsoft Foundry, and 2 more, with the cheapest tracked route listed at $0.05 input and $0.15 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 55.4, HellaSwag 92.0, and HumanEval 80.9.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$: A

1 relevant benchmark in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder


Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
DeepInfra	$0.050	$0.150	Serverless
Microsoft Foundry	$0.150	$0.150	Provisioned
Fireworks AI	$0.200	$0.200	Serverless
NVIDIA NIM	-	-	Serverless (Partial)
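
The price ladder can be turned into a per-request cost estimate. The sketch below is illustrative only: the prices are copied from the table above, NVIDIA NIM is omitted because no price is listed for it, and the function names `request_cost` and `cheapest_route` are hypothetical helpers, not part of any provider SDK.

```python
# USD per 1M tokens as (input, output), copied from the price ladder above.
# NVIDIA NIM is left out because the table lists no price for it.
PRICES = {
    "DeepInfra": (0.050, 0.150),
    "Microsoft Foundry": (0.150, 0.150),
    "Fireworks AI": (0.200, 0.200),
}


def request_cost(provider, input_tokens, output_tokens):
    """Estimate the USD cost of one request on the given provider route."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_route(input_tokens, output_tokens):
    """Return the provider with the lowest estimated cost for this token mix."""
    return min(PRICES, key=lambda p: request_cost(p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example: a long-context request with 100k input and 2k output tokens.
    for provider in PRICES:
        print(provider, round(request_cost(provider, 100_000, 2_000), 6))
    print("cheapest:", cheapest_route(100_000, 2_000))
```

Because the routes differ mainly on input price, input-heavy workloads such as RAG and long-context prompts see the largest spread between providers. Batch and cached-read discounts, where a provider offers them, are not modeled here.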

Available via routers & gateways(7)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSS · Microsoft Foundry

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

Passthrough · Fireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

Subscription · Microsoft Foundry

Azure AI Foundry Model Router

Router

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

Passthrough · Microsoft Foundry

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

Subscription · Microsoft Foundry

Kong AI Gateway

Gateway

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

Subscription · Microsoft Foundry

Capabilities

Structured Outputs

Benchmark peer bars for Coding

HumanEval · Rank 39 of 97

Claude Sonnet 4.6

98.0

96.7

Claude Opus 4.6

95.0

Grok-3

94.5

Qwen2-7B (current)

80.9

Benchmark scores(5)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version	Evaluation	Source
Google-Proof Q&A	55.4	diamond (observed 2026-03-06)	—	Source
HellaSwag	92.0	10-shot (observed 2026-03-06)	—	Source
HumanEval	80.9	pass@1 (observed 2026-03-06)	—	Source
Massive Multitask Language Understanding	80.2	5-shot (observed 2026-03-06)	—	Source
Grade School Math 8K	82.7	— (observed 2026-05-28)	—	Source
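
One way to read a score gap is to compare failure rates rather than raw points. The sketch below uses the HumanEval pass@1 figures from the peer bars above (Qwen2-7B at 80.9, Grok-3 at 94.5). It assumes scores are percentages where higher is better, which holds for pass@1 but not for every suite, and the helper names are hypothetical.

```python
def error_rate(score):
    """Failure rate in percentage points, assuming a 0-100 higher-is-better score."""
    return 100.0 - score


def error_ratio(lower_score, higher_score):
    """How many times more failures the lower score implies than the higher one."""
    return error_rate(lower_score) / error_rate(higher_score)


if __name__ == "__main__":
    qwen2_7b, grok_3 = 80.9, 94.5  # HumanEval pass@1, from the peer bars above
    print("point gap:", round(grok_3 - qwen2_7b, 1))
    print("failure ratio:", round(error_ratio(qwen2_7b, grok_3), 2))
```

Under these assumptions, a 13.6-point gap corresponds to roughly 3.5 times as many failed problems, which is why the same numeric gap matters more near the top of a benchmark's range than in the middle.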

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(2)

Best LLMs for Classification: Listed
Best Small Language Models (SLMs): Listed

Compare Qwen2-7B with other models

Qwen2-7B vs Llama 3.1 405B

Comparison and alternatives

Qwen2-7B vs Llama 3.1 405B

Frequently asked questions

What is the context window of Qwen2-7B?

Qwen2-7B has a context window of 128k tokens.

How much does Qwen2-7B cost?

Qwen2-7B input pricing ranges from $0.05 to $0.20 per 1M tokens, depending on the provider.

When was Qwen2-7B released?

Qwen2-7B was released on 2024-06-05.

Which providers offer Qwen2-7B?

Qwen2-7B is available from 5 providers: DeepInfra, OctoAI API (Deprecated), Microsoft Foundry, Fireworks AI, and NVIDIA NIM.

What benchmarks has Qwen2-7B been tested on?

Qwen2-7B has been evaluated on 5 benchmarks: Google-Proof Q&A, HellaSwag, HumanEval, Massive Multitask Language Understanding, and Grade School Math 8K.
