Pixtral Models by MistralAI
About
Pixtral is Mistral AI's family of multimodal large language models (LLMs), which process both text and images. Built on Mistral's text-only models, Pixtral adds a vision encoder that supports tasks such as image captioning, visual question answering, and multimodal content generation 18. The models vary in size, trading capability against efficiency. Some are available under specific free-use conditions, while others require a commercial license. The open-weight models are intended to promote collaboration and innovation within the research community 5.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Mistral Pixtral Large | Use when the workload needs 124B parameters and multimodal inputs. | 2024-12 | 124B parameters, multimodal inputs | Current |
| Pixtral Large | Use when the workload needs 128k context, 124B parameters, and structured outputs. | 2024-11 | 128k context, 124B parameters, structured outputs | Current |
| Pixtral 12B Instruct | Use when the workload needs 128k context, 12B parameters, and multimodal inputs. | 2024-09 | 128k context, 12B parameters, multimodal inputs | Current |
| Pixtral 12B Base | Use when the workload needs 128k context, 12B parameters, and multimodal inputs. | 2024-09 | 128k context, 12B parameters, multimodal inputs | Current |
Release Timeline
3 release groups
Specifications (4 models)
| Model | Released | Context | Parameters | Vision | Multimodal | Structured Outputs |
|---|---|---|---|---|---|---|
| Mistral Pixtral Large | 2024-12 | — | 124B | No | Yes | No |
| Pixtral Large | 2024-11 | 128k | 124B | Yes | Yes | Yes |
| Pixtral 12B Instruct | 2024-09 | 128k | 12B | Yes | Yes | No |
| Pixtral 12B Base | 2024-09 | 128k | 12B | Yes | Yes | No |
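The specifications table can be treated as a small filterable dataset. The following is a minimal sketch, not part of any Mistral tooling: the `Spec` class and the `shortlist` function are hypothetical names introduced here for illustration, and the rows are copied from the table above. A missing context value (shown as a dash in the table) is stored as `None` and is assumed not to satisfy any minimum-context requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Spec:
    """One row of the specifications table (hypothetical helper type)."""
    name: str
    released: str
    context_k: Optional[int]  # context window in thousands of tokens; None if unlisted
    params_b: int             # parameter count in billions
    vision: bool
    multimodal: bool
    structured_outputs: bool


# Rows copied from the specifications table above.
SPECS = [
    Spec("Mistral Pixtral Large", "2024-12", None, 124, False, True, False),
    Spec("Pixtral Large", "2024-11", 128, 124, True, True, True),
    Spec("Pixtral 12B Instruct", "2024-09", 128, 12, True, True, False),
    Spec("Pixtral 12B Base", "2024-09", 128, 12, True, True, False),
]


def shortlist(min_context_k: int = 0,
              max_params_b: Optional[int] = None,
              need_vision: bool = False,
              need_structured_outputs: bool = False) -> List[str]:
    """Return the names of listed models that meet every stated requirement."""
    names = []
    for spec in SPECS:
        # An unlisted context window cannot satisfy a minimum-context requirement.
        if min_context_k > 0 and (spec.context_k is None or spec.context_k < min_context_k):
            continue
        if max_params_b is not None and spec.params_b > max_params_b:
            continue
        if need_vision and not spec.vision:
            continue
        if need_structured_outputs and not spec.structured_outputs:
            continue
        names.append(spec.name)
    return names


if __name__ == "__main__":
    print(shortlist(need_structured_outputs=True))
    print(shortlist(min_context_k=128, max_params_b=12))
```

Filtering on the table values only narrows the candidates; licensing and provider availability still need to be checked separately.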
Available From (4 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Pixtral 12B Instruct | Vercel AI Gateway | $0.15 | $0.15 | Serverless |
| Pixtral Large | Mistral AI Studio | $2 | $6 | Serverless |
| Pixtral Large | OpenRouter | $2 | $6 | Serverless |
| Mistral Pixtral Large | AWS Bedrock | $2 | $6 | Serverless |
| Pixtral Large | Vercel AI Gateway | $2 | $6 | Serverless |
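Because prices are quoted per one million tokens, the cost of a request is the token count multiplied by the price and divided by 1,000,000, summed over input and output. The sketch below illustrates that arithmetic with the rates from the pricing table. The `PRICES` mapping and the `estimate_cost` function are hypothetical names, and the sketch assumes that any image input has already been converted into a token count by the provider and is included in `input_tokens`.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the pricing table above.
PRICES = {
    ("Pixtral 12B Instruct", "Vercel AI Gateway"): (Decimal("0.15"), Decimal("0.15")),
    ("Pixtral Large", "Mistral AI Studio"): (Decimal("2"), Decimal("6")),
    ("Pixtral Large", "OpenRouter"): (Decimal("2"), Decimal("6")),
    ("Mistral Pixtral Large", "AWS Bedrock"): (Decimal("2"), Decimal("6")),
    ("Pixtral Large", "Vercel AI Gateway"): (Decimal("2"), Decimal("6")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, provider: str,
                  input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one workload at the listed serverless rates."""
    input_price, output_price = PRICES[(model, provider)]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # 1M input tokens and 500k output tokens on Pixtral Large via OpenRouter.
    print(estimate_cost("Pixtral Large", "OpenRouter", 1_000_000, 500_000))
    # The same workload on Pixtral 12B Instruct via Vercel AI Gateway.
    print(estimate_cost("Pixtral 12B Instruct", "Vercel AI Gateway", 1_000_000, 500_000))
```

`Decimal` is used instead of floating point so that small per-token amounts do not pick up rounding noise. Listed prices can change, so the table values should be rechecked with the provider before budgeting.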
Frequently Asked Questions
- What is Pixtral used for?
- Pixtral is used for vision and multimodal work, structured outputs, and coding. The family description and listed model capabilities point to those workloads as the best fit.
- How does Pixtral compare to Ministral?
- Pixtral by MistralAI is strongest where you need vision and multimodal work, while Ministral by MistralAI is the closest related family to check for structured outputs. Pixtral has 4 listed variants and reaches up to 128k context, while Ministral reaches up to 32k context, so compare the specs and pricing tables before choosing a production model.
- Which Pixtral model should I use?
- For the lowest listed input price, start with Pixtral 12B Instruct through Vercel AI Gateway at $0.15/1M input tokens. For the most capable listed option, evaluate Pixtral Large, which offers 128k context, structured outputs, and multimodal inputs.






