What is Pixtral used for?

Pixtral is used for vision and multimodal work, structured outputs, and coding. The family description and listed model capabilities point to those workloads as the best fit.

How does Pixtral compare to Ministral?

Pixtral by MistralAI is strongest where you need vision and multimodal work, while Ministral by MistralAI is the closest related family to check for vision and multimodal work. Pixtral has 4 listed variants and reaches up to 128k context, while Ministral reaches up to 32k context, so compare the specs and pricing tables before choosing a production model.

Which Pixtral model should I use?

For the lowest listed input price, start with Pixtral 12B Instruct through Vercel AI Gateway at $0.15/1M input tokens. For the most capable/latest local choice, evaluate Pixtral Large with 128k context and structured outputs and multimodal inputs.

Pixtral Models by MistralAI

MistralAIApache 2.0Open source

4 models2024Up to 128k ctxFrom $0.15/1M input

Details

ResearcherMistralAI

LicenseApache 2.0OSI-approved

Commercial useCommercial use: permitted

Models4

Released2024

Max context128k

Capabilities

Vision3 of 4 models

MultimodalAll models

Structured Outputs1 of 4 models

Links

Website HuggingFace

About

Pixtral, developed by Mistral AI, is an innovative family of large language models (LLMs) that excels in multimodal AI by integrating both text and image processing capabilities. Built upon Mistral's successful text-only models, Pixtral introduces a vision encoder, enabling it to effectively tackle tasks like image captioning, visual question answering, and multimodal content generation 18. The models vary in size, balancing processing power and efficiency, and while some are available under specific free-use conditions, others require a commercial license. Its open-weight models promote collaboration and innovation within the research community 5.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

4 in view

Mistral Pixtral LargeCurrent

Use when the workload needs 124B parameters and multimodal inputs.

2024-12124B parametersmultimodal inputs

Pixtral LargeCurrent

Use when the workload needs 128k context, 124B parameters, and structured outputs.

2024-11128k context124B parametersstructured outputs

Pixtral 12B InstructCurrent

Use when the workload needs 128k context, 12B parameters, and multimodal inputs.

2024-09128k context12B parametersmultimodal inputs

Pixtral 12B BaseCurrent

Use when the workload needs 128k context, 12B parameters, and multimodal inputs.

2024-09128k context12B parametersmultimodal inputs

Current Pixtral variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Mistral Pixtral Large	Use when the workload needs 124B parameters and multimodal inputs.	2024-12	124B parametersmultimodal inputs	Current
Pixtral Large	Use when the workload needs 128k context, 124B parameters, and structured outputs.	2024-11	128k context124B parametersstructured outputs	Current
Pixtral 12B Instruct	Use when the workload needs 128k context, 12B parameters, and multimodal inputs.	2024-09	128k context12B parametersmultimodal inputs	Current
Pixtral 12B Base	Use when the workload needs 128k context, 12B parameters, and multimodal inputs.	2024-09	128k context12B parametersmultimodal inputs	Current

Release Timeline

3 release groups

2024-12

1 current

Mistral Pixtral Large

124B parametersmultimodal inputs

Current

2024-11

1 current

Pixtral Large

128k context124B parametersstructured outputs

Current

2024-09

2 current

Pixtral 12B Base

128k context12B parametersmultimodal inputs

Current

Pixtral 12B Instruct

128k context12B parametersmultimodal inputs

Current

Specifications(4 models)

Pixtral model specifications comparison
Model	Released	Context	Parameters	Vision	Multimodal	Structured Outputs
Mistral Pixtral Large	2024-12	—	124B	No	Yes	No
Pixtral Large	2024-11	128k	124B	Yes	Yes	Yes
Pixtral 12B Instruct	2024-09	128k	12B	Yes	Yes	No
Pixtral 12B Base	2024-09	128k	12B	Yes	Yes	No

Available From(4 providers)

Pricing

Pixtral model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
Pixtral 12B Instruct	Vercel AI Gateway	$0.15	$0.15	Serverless
Pixtral Large	Mistral AI Studio	$2	$6	Serverless
Pixtral Large	OpenRouter	$2	$6	Serverless
Mistral Pixtral Large	AWS Bedrock	$2	$6	Serverless
Pixtral Large	Vercel AI Gateway	$2	$6	Serverless

Frequently Asked Questions

What is Pixtral used for?: Pixtral is used for vision and multimodal work, structured outputs, and coding. The family description and listed model capabilities point to those workloads as the best fit.
How does Pixtral compare to Ministral?: Pixtral by MistralAI is strongest where you need vision and multimodal work, while Ministral by MistralAI is the closest related family to check for vision and multimodal work. Pixtral has 4 listed variants and reaches up to 128k context, while Ministral reaches up to 32k context, so compare the specs and pricing tables before choosing a production model.
Which Pixtral model should I use?: For the lowest listed input price, start with Pixtral 12B Instruct through Vercel AI Gateway at $0.15/1M input tokens. For the most capable/latest local choice, evaluate Pixtral Large with 128k context and structured outputs and multimodal inputs.

Models(4)

Mistral Pixtral Large

2024-12124B1 provider

MultimodalOpen Source

Pixtral Large

2024-11128k124B3 providers

MultimodalOpen Source

Pixtral 12B Instruct

2024-09128k12B1 provider

MultimodalOpen Source

Pixtral 12B Base

2024-09128k12B

MultimodalOpen Source

Pixtral Models by MistralAI

Details

Capabilities

Links

About

Current Variants

Release Timeline

Specifications(4 models)

Available From(4 providers)

Pricing

Frequently Asked Questions

Related Model Families

Models(4)