Qwen2.5-Coder Models by Alibaba
About
Qwen2.5-Coder is a family of language models built for programming tasks and code-related reasoning. The models range from 0.5 billion to 32 billion parameters. The 7B, 14B, and 32B variants support contexts of up to 128,000 tokens, while the 0.5B, 1.5B, and 3B variants support 32,000. The family covers 92 programming languages and targets code generation, code repair, and multi-language programming tasks. The 7-billion parameter variant outperforms larger models such as DeepSeek-Coder-V2-Lite on specific benchmarks.

The family includes both base and instruction-tuned models. The instruction-tuned "Coder-Instruct" models improve performance across tasks and generalize better. The models are benchmarked on datasets such as McEval for multi-language programming and CRUXEval for code reasoning, with strong results in code inference and mathematical tasks. Training on a mix of diverse datasets preserves general capabilities, so the models remain useful outside purely technical domains.

Qwen2.5-Coder is open-sourced under the Apache 2.0 license. The 1.5B and 7B models were released first, in 2024-09. The 32-billion parameter model, still in development at that point, followed in 2024-11 together with the 0.5B, 3B, and 14B sizes. Practical applications include code assistants and artifact generation tools.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Qwen2.5-Coder-14B | Use when the workload needs code, 128k context, and 14B parameters. | 2024-11 | code, 128k context, 14B parameters | Current |
| Qwen2.5-Coder-14B-Instruct | Use when the workload needs code, 128k context, and 14B parameters. | 2024-11 | code, 128k context, 14B parameters | Current |
| Qwen2.5-Coder-32B | Use when the workload needs code, 128k context, and 32B parameters. | 2024-11 | code, 128k context, 32B parameters | Current |
| Qwen2.5-Coder-32B-Instruct | Use when the workload needs code, 128k context, and 32B parameters. | 2024-11 | code, 128k context, 32B parameters | Current |
| Qwen2.5-Coder-3B | Use when the workload needs code, 32k context, and 3B parameters. | 2024-11 | code, 32k context, 3B parameters | Current |
| Qwen2.5-Coder-3B-Instruct | Use when the workload needs code, 32k context, and 3B parameters. | 2024-11 | code, 32k context, 3B parameters | Current |
| Qwen2.5-Coder-0.5B | Use when the workload needs code, 32k context, and 500M parameters. | 2024-11 | code, 32k context, 500M parameters | Current |
| Qwen2.5-Coder-0.5B-Instruct | Use when the workload needs code, 32k context, and 500M parameters. | 2024-11 | code, 32k context, 500M parameters | Current |
| Qwen2.5-Coder-1.5B | Use when the workload needs code, 32k context, and 1.5B parameters. | 2024-09 | code, 32k context, 1.5B parameters | Current |
| Qwen2.5-Coder-1.5B-Instruct | Use when the workload needs code, 32k context, and 1.5B parameters. | 2024-09 | code, 32k context, 1.5B parameters | Current |
| Qwen2.5-Coder-7B | Use when the workload needs code, 128k context, and 7.6B parameters. | 2024-09 | code, 128k context, 7.6B parameters | Current |
| Qwen2.5-Coder-7B-Instruct | Use when the workload needs code, 128k context, and 7.6B parameters. | 2024-09 | code, 128k context, 7.6B parameters | Current |
Release Timeline
2 release groups
Specifications (12 models)
| Model | Released | Context | Parameters | Structured Outputs | Code Exec |
|---|---|---|---|---|---|
| Qwen2.5-Coder-14B | 2024-11 | 128k | 14B | No | No |
| Qwen2.5-Coder-14B-Instruct | 2024-11 | 128k | 14B | No | No |
| Qwen2.5-Coder-32B | 2024-11 | 128k | 32B | Yes | Yes |
| Qwen2.5-Coder-32B-Instruct | 2024-11 | 128k | 32B | Yes | Yes |
| Qwen2.5-Coder-3B | 2024-11 | 32k | 3B | No | No |
| Qwen2.5-Coder-3B-Instruct | 2024-11 | 32k | 3B | No | No |
| Qwen2.5-Coder-0.5B | 2024-11 | 32k | 0.5B | No | No |
| Qwen2.5-Coder-0.5B-Instruct | 2024-11 | 32k | 0.5B | No | No |
| Qwen2.5-Coder-1.5B | 2024-09 | 32k | 1.54B | No | No |
| Qwen2.5-Coder-1.5B-Instruct | 2024-09 | 32k | 1.54B | No | No |
| Qwen2.5-Coder-7B | 2024-09 | 128k | 7.61B | No | No |
| Qwen2.5-Coder-7B-Instruct | 2024-09 | 128k | 7.61B | Yes | No |
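Choosing a variant from the specifications table amounts to filtering on context and feature requirements, then taking the smallest model that remains. The following is a minimal sketch of that idea, not part of any Qwen tooling. The `Variant` class and `smallest_fit` function are hypothetical names introduced for illustration, and only the Instruct rows of the table are copied in.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Variant:
    """One row of the specifications table (hypothetical helper type)."""
    name: str
    context_k: int            # context window, in thousands of tokens
    params_b: float           # parameter count, in billions
    structured_outputs: bool  # "Structured Outputs" column


# Instruct rows copied from the specifications table above.
INSTRUCT_VARIANTS = (
    Variant("Qwen2.5-Coder-0.5B-Instruct", 32, 0.5, False),
    Variant("Qwen2.5-Coder-1.5B-Instruct", 32, 1.54, False),
    Variant("Qwen2.5-Coder-3B-Instruct", 32, 3.0, False),
    Variant("Qwen2.5-Coder-7B-Instruct", 128, 7.61, True),
    Variant("Qwen2.5-Coder-14B-Instruct", 128, 14.0, False),
    Variant("Qwen2.5-Coder-32B-Instruct", 128, 32.0, True),
)


def smallest_fit(
    min_context_k: int,
    need_structured: bool = False,
    variants: Sequence[Variant] = INSTRUCT_VARIANTS,
) -> Optional[Variant]:
    """Return the smallest variant meeting the constraints, or None."""
    candidates = [
        v
        for v in variants
        if v.context_k >= min_context_k
        and (v.structured_outputs or not need_structured)
    ]
    # min() with default=None avoids raising when nothing qualifies.
    return min(candidates, key=lambda v: v.params_b, default=None)
```

For example, `smallest_fit(100)` selects the 7B Instruct model, since it is the smallest listed variant with a 128k context. Parameter count is used here only as a rough proxy for serving cost. A real selection would also weigh benchmark quality and provider availability.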
Available From (7 providers)
Pricing
| Model | Provider | Input (USD / 1M tokens) | Output (USD / 1M tokens) | Type |
|---|---|---|---|---|
| Qwen2.5-Coder-1.5B-Instruct | Fireworks AI | $0.1 | $0.1 | Serverless |
| Qwen2.5-Coder-3B-Instruct | Fireworks AI | $0.1 | $0.1 | Serverless |
| Qwen2.5-Coder-32B-Instruct | SiliconFlow | $0.18 | $0.18 | Serverless |
| Qwen2.5-Coder-32B | DeepInfra | $0.2 | $0.2 | Serverless |
| Qwen2.5-Coder-14B-Instruct | Fireworks AI | $0.2 | $0.2 | Serverless |
| Qwen2.5-Coder-7B-Instruct | Fireworks AI | $0.2 | $0.2 | Serverless |
| Qwen2.5-Coder-32B-Instruct | Arcee AI | $0.4 | $1.2 | Serverless |
| Qwen2.5-Coder-32B-Instruct | Cloudflare Workers AI | $0.66 | $1 | Serverless |
| Qwen2.5-Coder-32B-Instruct | OpenRouter | $0.66 | $1 | Serverless |
| Qwen2.5-Coder-32B-Instruct | Fireworks AI | $0.9 | $0.9 | Serverless |
| Qwen2.5-Coder-32B | Fireworks AI | $0.9 | $0.9 | Serverless |
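Because several providers price input and output tokens differently, the cheapest option depends on the input-to-output ratio of the workload. The sketch below shows the arithmetic for Qwen2.5-Coder-32B-Instruct using the prices listed in the table above. The function names `workload_cost` and `cheapest_provider` are hypothetical, and the token counts in the usage note are made-up illustration values.

```python
from decimal import Decimal

# (input, output) price in USD per 1M tokens for Qwen2.5-Coder-32B-Instruct,
# copied from the pricing table above.
PRICES_32B_INSTRUCT = {
    "SiliconFlow": (Decimal("0.18"), Decimal("0.18")),
    "Arcee AI": (Decimal("0.4"), Decimal("1.2")),
    "Cloudflare Workers AI": (Decimal("0.66"), Decimal("1")),
    "OpenRouter": (Decimal("0.66"), Decimal("1")),
    "Fireworks AI": (Decimal("0.9"), Decimal("0.9")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def workload_cost(provider: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Cost in USD for the given token counts at one provider."""
    input_price, output_price = PRICES_32B_INSTRUCT[provider]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_UNIT


def cheapest_provider(input_tokens: int, output_tokens: int) -> tuple[str, Decimal]:
    """Return (provider, cost) with the lowest cost for this workload."""
    costs = {
        name: workload_cost(name, input_tokens, output_tokens)
        for name in PRICES_32B_INSTRUCT
    }
    best = min(costs, key=costs.get)
    return best, costs[best]
```

With 2,000,000 input tokens and 500,000 output tokens, `workload_cost("Arcee AI", 2_000_000, 500_000)` comes to 0.4 × 2 + 1.2 × 0.5 = 1.40 USD, while SiliconFlow comes to 0.18 × 2.5 = 0.45 USD. `Decimal` is used instead of `float` so that small per-token prices do not accumulate rounding error.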
Frequently Asked Questions
- What is Qwen2.5-Coder used for?
- Qwen2.5-Coder is used for coding and structured outputs. The family description and the listed model capabilities point to those workloads as the best fit.
- How does Qwen2.5-Coder compare to Tongyi DeepResearch?
- Both families are from Alibaba. Qwen2.5-Coder is the stronger fit for coding workloads, while Tongyi DeepResearch is the closest related family to consider for adjacent needs. Qwen2.5-Coder lists 12 variants with up to 128k context, and Tongyi DeepResearch reaches up to 131k context. Compare the specifications and pricing tables before choosing a production model.
- Which Qwen2.5-Coder model should I use?
- For the lowest listed input price, start with Qwen2.5-Coder-1.5B-Instruct through Fireworks AI at $0.1 per 1M input tokens. Qwen2.5-Coder-3B-Instruct is listed at the same price. For the most capable local choice, evaluate Qwen2.5-Coder-32B, which offers 128k context and structured outputs.
Models (12)
Qwen2.5-Coder-14B
Qwen2.5-Coder-14B-Instruct
Qwen2.5-Coder-32B
Qwen2.5-Coder-32B-Instruct
Qwen2.5-Coder-3B
Qwen2.5-Coder-3B-Instruct
Qwen2.5-Coder-0.5B
Qwen2.5-Coder-0.5B-Instruct
Qwen2.5-Coder-1.5B
Qwen2.5-Coder-1.5B-Instruct
Qwen2.5-Coder-7B
Qwen2.5-Coder-7B-Instruct






