What is Llama 4 used for?

Llama 4 is used for vision and multimodal work and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.

How does Llama 4 compare to Chameleon?

Llama 4 by AI at Meta is strongest where you need vision and multimodal work, while Chameleon by AI at Meta is the closest related family to check for coding. Llama 4 has 2 listed variants and reaches up to 10m context, while Chameleon reaches up to 4k context, so compare the specs and pricing tables before choosing a production model.

Which Llama 4 model should I use?

Llama 4 Scout 17B-16E Instruct is both the lowest listed input-price option at $0.08/1M input tokens through DeepInfra and the strongest local starting point with 10m context and structured outputs and multimodal inputs. Use the provider table if latency, deployment type, or output-token pricing matters more than input price.

Llama 4 Models by AI at Meta

AI at MetaLlama 4 CommunityOpen weights

2 models2025Up to 10m ctxFrom $0.08/1M input

Details

ResearcherAI at Meta

LicenseLlama 4 Community

Commercial useCommercial use: conditional

Models2

Released2025

Max context10m

Capabilities

VisionAll models

MultimodalAll models

Structured OutputsAll models

Links

Website

About

Meta's Llama 4 family of large language models, featuring Mixture-of-Experts architectures for efficient inference.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

2 in view

Llama 4 Maverick 17B Instruct FP8Current

Use when the workload needs 1m context, structured outputs, and multimodal inputs.

2025-041m contextstructured outputsmultimodal inputs

Llama 4 Scout 17B-16E InstructCurrent

Use when the workload needs 10m context, structured outputs, and multimodal inputs.

2025-0410m contextstructured outputsmultimodal inputs

Current Llama 4 variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Llama 4 Maverick 17B Instruct FP8	Use when the workload needs 1m context, structured outputs, and multimodal inputs.	2025-04	1m contextstructured outputsmultimodal inputs	Current
Llama 4 Scout 17B-16E Instruct	Use when the workload needs 10m context, structured outputs, and multimodal inputs.	2025-04	10m contextstructured outputsmultimodal inputs	Current

Release Timeline

1 release group

2025-04

2 current

Llama 4 Maverick 17B Instruct FP8

1m contextstructured outputsmultimodal inputs

Current

Llama 4 Scout 17B-16E Instruct

10m contextstructured outputsmultimodal inputs

Current

Specifications(2 models)

Llama 4 model specifications comparison
Model	Released	Context	Parameters	Vision	Multimodal	Structured Outputs
Llama 4 Maverick 17B Instruct FP8	2025-04	1m	400B (17B active)	Yes	Yes	Yes
Llama 4 Scout 17B-16E Instruct	2025-04	10m	109B (17B active)	Yes	Yes	Yes

Available From(12 providers)

AWS Bedrock

Cloudflare Workers AI

Pricing

Llama 4 model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
Llama 4 Scout 17B-16E Instruct	OpenRouter	$0.08	$0.3	Serverless
Llama 4 Scout 17B-16E Instruct	DeepInfra	$0.08	$0.3	Serverless
Llama 4 Scout 17B-16E Instruct	GroqCloud	$0.11	$0.34	Serverless
Llama 4 Maverick 17B Instruct FP8	OpenRouter	$0.15	$0.6	Serverless
Llama 4 Maverick 17B Instruct FP8	DeepInfra	$0.15	$0.6	Serverless
Llama 4 Scout 17B-16E Instruct	AWS Bedrock	$0.17	$0.22	Serverless
Llama 4 Scout 17B-16E Instruct	Vercel AI Gateway	$0.17	$0.66	Serverless
Llama 4 Scout 17B-16E Instruct	Novita AI	$0.18	$0.59	Serverless
Llama 4 Scout 17B-16E Instruct	GCP Vertex AI	$0.2	$0.65	Serverless
Llama 4 Scout 17B-16E Instruct	Microsoft Foundry	$0.2	$0.78	Serverless
Llama 4 Maverick 17B Instruct FP8	AWS Bedrock	$0.24	$0.97	Serverless
Llama 4 Maverick 17B Instruct FP8	Vercel AI Gateway	$0.24	$0.97	Serverless
Llama 4 Scout 17B-16E Instruct	Cloudflare Workers AI	$0.27	$0.85	Serverless
Llama 4 Maverick 17B Instruct FP8	Together AI	$0.27	$0.85	Serverless
Llama 4 Maverick 17B Instruct FP8	Novita AI	$0.27	$0.85	Serverless
Llama 4 Maverick 17B Instruct FP8	Microsoft Foundry	$0.35	$1.41	Serverless
Llama 4 Maverick 17B Instruct FP8	GCP Vertex AI	$0.35	$1.15	Serverless

Comparisons

All comparisons →

Frequently Asked Questions

What is Llama 4 used for?: Llama 4 is used for vision and multimodal work and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.
How does Llama 4 compare to Chameleon?: Llama 4 by AI at Meta is strongest where you need vision and multimodal work, while Chameleon by AI at Meta is the closest related family to check for coding. Llama 4 has 2 listed variants and reaches up to 10m context, while Chameleon reaches up to 4k context, so compare the specs and pricing tables before choosing a production model.
Which Llama 4 model should I use?: For the lowest listed input price, start with Llama 4 Scout 17B-16E Instruct through DeepInfra at $0.08/1M input tokens. For the most capable/latest local choice, evaluate Llama 4 Scout 17B-16E Instruct with 10m context and structured outputs and multimodal inputs.

Models(2)