LLM Reference

Llama 3.3 Models by AI at Meta

3 models · 2024–2025 · Up to 128K ctx · From $0.1/1M input

About

Llama 3.3 is a family of 3 AI models by AI at Meta, released between 2024 and 2025.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Llama 3.3 70B
Use when the workload needs 8K context, 70B parameters, and tool use.
2025-12 · 8K context · 70B parameters · tool use

Llama 3.3 70B Instruct
Use when the workload needs 128K context, 70B parameters, and structured outputs.
2025-09 · 128K context · 70B parameters · structured outputs

Llama 3.3 70B Instruct (free)
Use when the workload needs 66K context, 70B parameters, and structured outputs.
2024-12 · 66K context · 70B parameters · structured outputs

Release Timeline

3 release groups

2025-12: Llama 3.3 70B (current) · 8K context · 70B parameters · tool use
2025-09: Llama 3.3 70B Instruct (current) · 128K context · 70B parameters · structured outputs
2024-12: Llama 3.3 70B Instruct (free) (current) · 66K context · 70B parameters · structured outputs

Specifications (3 models)

Llama 3.3 model specifications comparison
Model | Released | Context | Parameters | Vision | Multimodal | Fn Calling | Tool Use | Structured Outputs
Llama 3.3 70B | 2025-12 | 8K | 70B | Yes | Yes | Yes | Yes | No
Llama 3.3 70B Instruct | 2025-09 | 128K | 70B | No | No | No | No | Yes
Llama 3.3 70B Instruct (free) | 2024-12 | 66K | 70B | No | No | No | No | Yes
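
The use-when guidance amounts to filtering the listed variants by requirement. A minimal sketch of that filter follows, using rows copied from the specifications table; the `ModelSpec` class and `candidates` helper are hypothetical names introduced here for illustration, not part of any provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_k: int            # context window, in thousands of tokens
    tool_use: bool
    structured_outputs: bool


# Rows copied from the specifications table (hypothetical in-memory form).
SPECS = [
    ModelSpec("Llama 3.3 70B", 8, True, False),
    ModelSpec("Llama 3.3 70B Instruct", 128, False, True),
    ModelSpec("Llama 3.3 70B Instruct (free)", 66, False, True),
]


def candidates(min_context_k=0, need_tool_use=False, need_structured_outputs=False):
    """Return the names of listed variants that meet every requirement."""
    return [
        s.name
        for s in SPECS
        if s.context_k >= min_context_k
        and (s.tool_use or not need_tool_use)
        and (s.structured_outputs or not need_structured_outputs)
    ]


long_context = candidates(min_context_k=100, need_structured_outputs=True)
tool_users = candidates(need_tool_use=True)
print(long_context)
print(tool_users)
```

With the listed data, a 100K-token requirement plus structured outputs leaves only Llama 3.3 70B Instruct, and a tool-use requirement leaves only Llama 3.3 70B.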

Available From (11 providers)

Pricing

Llama 3.3 model pricing by provider
Model | Provider | Input / 1M | Output / 1M | Type
Llama 3.3 70B Instruct (free) | OpenRouter | $0.1 | $0.32 | Serverless
Llama 3.3 70B Instruct (free) | Novita AI | $0.135 | $0.4 | Serverless
Llama 3.3 70B Instruct (free) | Chutes AI | $0.22 | $0.66 | Serverless
Llama 3.3 70B Instruct (free) | Together AI | $0.44 | $0.44 | Serverless
Llama 3.3 70B Instruct (free) | GroqCloud | $0.59 | $0.79 | Serverless
Llama 3.3 70B Instruct (free) | Arcee AI | $0.6 | $1.8 | Serverless
Llama 3.3 70B Instruct (free) | Microsoft Foundry | $0.71 | $0.71 | Serverless
Llama 3.3 70B Instruct (free) | AWS Bedrock | $0.72 | $0.72 | Serverless
Llama 3.3 70B Instruct (free) | Vercel AI Gateway | $0.72 | $0.72 | Serverless
Llama 3.3 70B | Fireworks AI | $0.9 | $0.9 | Serverless
Llama 3.3 70B Instruct | AWS Bedrock | $0.96 | $1.28 | Serverless
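
Because input and output tokens are priced separately, the cheapest provider by input price is not always the cheapest for a given workload. The sketch below estimates the cost of a workload from a few rows of the pricing table. The `estimate_cost` and `cheapest` helpers and the token counts are hypothetical, and the prices are simply copied from the listing, so they may be out of date.

```python
# USD per 1M tokens as (input, output), copied from the pricing table.
PRICES = {
    ("Llama 3.3 70B Instruct (free)", "OpenRouter"): (0.10, 0.32),
    ("Llama 3.3 70B Instruct (free)", "Together AI"): (0.44, 0.44),
    ("Llama 3.3 70B Instruct (free)", "Arcee AI"): (0.60, 1.80),
    ("Llama 3.3 70B", "Fireworks AI"): (0.90, 0.90),
    ("Llama 3.3 70B Instruct", "AWS Bedrock"): (0.96, 1.28),
}


def estimate_cost(model, provider, input_tokens, output_tokens):
    """Estimated USD cost of a workload (hypothetical helper)."""
    input_price, output_price = PRICES[(model, provider)]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Return the (model, provider) pair with the lowest estimated cost."""
    return min(PRICES, key=lambda k: estimate_cost(*k, input_tokens, output_tokens))


# Example workload: 2M input tokens and 0.5M output tokens (assumed figures).
cost = estimate_cost("Llama 3.3 70B Instruct (free)", "OpenRouter", 2_000_000, 500_000)
best = cheapest(2_000_000, 500_000)
print(f"${cost:.2f}", best)
```

For output-heavy workloads the ranking can shift: a provider with equal input and output prices may undercut one with a lower input price but a much higher output price.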

Frequently Asked Questions

What is Llama 3.3 used for?
Llama 3.3 is used for vision and multimodal work, agent workflows and tool use, and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.
How does Llama 3.3 compare to Chameleon?
Llama 3.3 by AI at Meta is strongest for vision and multimodal work, while Chameleon, also by AI at Meta, is the closest related family to consider for coding. Llama 3.3 has 3 listed variants and reaches up to 128K context, whereas Chameleon reaches up to 4K context, so compare the specs and pricing tables before choosing a production model.
Which Llama 3.3 model should I use?
For the lowest listed input price, start with Llama 3.3 70B Instruct (free) through OpenRouter at $0.1 per 1M input tokens. For the most recently released variant, evaluate Llama 3.3 70B (2025-12), which is listed with 8K context, tool use, function calling, and multimodal inputs.

Models (3)