Qwen1.5
About
The Qwen1.5 family is an advanced series of large language models (LLMs) developed by Alibaba Cloud, serving as a beta precursor to the Qwen2 series. The collection spans eight dense model sizes, from 0.5 billion to 110 billion parameters, plus a Mixture of Experts (MoE) model with roughly 14 billion total parameters and 2.7 billion activated per token. Available in both base and fine-tuned chat variants, the models offer enhanced human-aligned responses, stronger multilingual support across varied languages, and context lengths of up to 32,768 tokens. Designed for ease of use, Qwen1.5 models integrate with popular frameworks such as Hugging Face Transformers, vLLM, and llama.cpp. Additionally, a specialized CodeQwen1.5 model focuses on code generation and supports contexts of up to 64K tokens.
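The Qwen1.5 chat models follow the ChatML conversation format; in practice, `tokenizer.apply_chat_template` in Hugging Face Transformers renders it for you, but a minimal hand-rolled sketch makes the structure concrete (roles and special tags below follow the Qwen1.5 model cards; the helper name is our own):

```python
def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts into a ChatML prompt string,
    ending with an open assistant turn for the model to complete."""
    parts = []
    for m in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    # Leave the assistant turn open so generation continues from here.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me a short introduction to LLMs."},
])
print(prompt)
```

With Transformers, the equivalent call would be `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` on a Qwen1.5 chat checkpoint, which handles the same tags automatically.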
Specifications (11 models)
| Model | Released | Context | Parameters | Vision | Structured Outputs |
|---|---|---|---|---|---|
| Qwen-Max | 2024-05 | 128K | — | Yes | Yes |
| Qwen1.5-110B | 2024-04 | — | 110B | No | Yes |
| Qwen1.5-MoE-A2.7B | 2024-03 | — | 2.7B | No | No |
| Qwen1.5-72B | 2024-02 | — | 72B | No | Yes |
| Qwen1.5-32B | 2024-02 | — | 32B | No | Yes |
| Qwen1.5-14B | 2024-02 | — | 14B | No | No |
| Qwen1.5-7B | 2024-02 | — | 7B | No | Yes |
| Qwen1.5-4B | 2024-02 | — | 4B | No | Yes |
| Qwen1.5-1.8B | 2024-02 | — | 1.8B | No | Yes |
| Qwen1.5-0.5B | 2024-02 | — | 0.5B | No | Yes |
| DeepInfra Qwen1.5-72B-Chat | 2024-02 | 33K | 72B | No | Yes |
Available From (8 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Qwen1.5-7B | Replicate API | $0.05 | $0.25 | Serverless |
| Qwen1.5-4B | Replicate API | $0.05 | $0.25 | Serverless |
| Qwen1.5-1.8B | Replicate API | $0.05 | $0.25 | Serverless |
| Qwen1.5-0.5B | Replicate API | $0.05 | $0.25 | Serverless |
| Qwen1.5-0.5B | Together AI | $0.10 | $0.10 | Serverless |
| Qwen1.5-1.8B | Together AI | $0.10 | $0.10 | Serverless |
| Qwen1.5-4B | Together AI | $0.10 | $0.10 | Serverless |
| Qwen1.5-14B | Replicate API | $0.10 | $0.50 | Serverless |
| Qwen1.5-7B | Together AI | $0.20 | $0.20 | Serverless |
| Qwen1.5-32B | Replicate API | $0.20 | $1.00 | Serverless |
| DeepInfra Qwen1.5-72B-Chat | DeepInfra | $0.45 | $0.65 | Serverless |
| Qwen1.5-72B | Replicate API | $0.65 | $2.75 | Serverless |
| Qwen1.5-32B | Together AI | $0.80 | $0.80 | Serverless |
| Qwen1.5-72B | Fireworks AI | $0.90 | $0.90 | Provisioned |
| Qwen1.5-72B | Together AI | $0.90 | $0.90 | Serverless |
| Qwen-Max | OpenRouter | $1.04 | $4.16 | Serverless |
| Qwen1.5-110B | Microsoft Foundry | $1.50 | $2.50 | Provisioned |
| Qwen1.5-110B | Together AI | $1.80 | $1.80 | Serverless |
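All rates above are quoted per million tokens, so the cost of a single request is simply each side's token count divided by 1,000,000 times its rate. A minimal sketch (the function name is our own; the example rates are taken from the Qwen1.5-72B Together AI row above):

```python
def request_cost(input_tokens, output_tokens, in_rate_per_m, out_rate_per_m):
    """Dollar cost of one request, given per-million-token rates."""
    return (input_tokens / 1_000_000) * in_rate_per_m \
         + (output_tokens / 1_000_000) * out_rate_per_m

# Qwen1.5-72B on Together AI: $0.90 input / $0.90 output per 1M tokens.
cost = request_cost(input_tokens=2_000, output_tokens=500,
                    in_rate_per_m=0.90, out_rate_per_m=0.90)
print(f"${cost:.5f}")  # roughly $0.00225 for this request
```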
Frequently Asked Questions
- What is Qwen1.5?
- Qwen1.5 is a family of large language models developed by Alibaba Cloud as a beta precursor to the Qwen2 series. It spans dense models from 0.5B to 110B parameters plus an MoE variant, ships in both base and fine-tuned chat versions, supports context lengths of up to 32,768 tokens, and integrates with Hugging Face Transformers, vLLM, and llama.cpp. A specialized CodeQwen1.5 model targets code generation with contexts of up to 64K tokens.
- How many models are in the Qwen1.5 family?
- The Qwen1.5 family contains 11 models.
- What is the latest Qwen1.5 model?
- The latest model is Qwen-Max, released in May 2024.
- How much does Qwen1.5 cost?
- Input pricing for Qwen1.5 models ranges from $0.05 to $1.80 per million tokens, depending on the model and provider.
Models (11)
Qwen-Max
Qwen1.5-110B
Qwen1.5-MoE-A2.7B
Qwen1.5-72B
Qwen1.5-32B
Qwen1.5-14B
Qwen1.5-7B
Qwen1.5-4B
Qwen1.5-1.8B
Qwen1.5-0.5B
DeepInfra Qwen1.5-72B-Chat