Llama 3.2 3B Instruct
Llama 3.2 3B Instruct is worth evaluating for RAG, long-context, and classification workloads when its provider route and context window match the workload.
Use it for
- Teams evaluating RAG, long context, and classification
- Workloads that can use a 128k context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Vision or document-understanding workloads
| Field | Value |
|---|---|
| Family | Llama 3.2 |
| Released | 2024-09-25 |
| Context | 128k |
| Parameters | 3.21B |
| Architecture | Decoder-only |
| Knowledge cutoff | 2023-12 |
| Specialization | General |
| Training | Fine-tuned |
Large-scale open-source AI for social technologies.
Cheapest of 7 routes · Novita AI
About
Llama 3.2 3B Instruct is the 3.21B-parameter instruction-tuned model in Meta's Llama 3.2 family. It offers a 128k-token context window, openly available weights for self-hosting, and a score of 34.7 on MMLU-Pro.
Llama 3.2 3B Instruct is an open-source model in the Llama 3.2 family. The structured metadata tracks a 128k-token context window and support for structured outputs. This page tracks 7 provider routes (Cloudflare Workers AI, OpenRouter, Fireworks AI, and 4 more), with the cheapest tracked route, Novita AI, listed at $0.03 input and $0.05 output per 1M tokens. Headline tracked benchmarks are BFCL 21.9 and MMLU-Pro 34.7.
Top use-case fit
RAG
Included by capability and metadata signals in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Classification
Q/$ A · 1 relevant benchmark in the decision map.
Provider price ladder
Compare API pricing across 4 providers for input and output tokens, plus batch and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Novita AI | $0.030 | $0.050 | Serverless |
| Fireworks AI | $0.100 | $0.100 | Serverless |
| AWS Bedrock | $0.150 | $0.150 | Serverless |
| Vercel AI Gateway | $0.150 | $0.150 | Serverless |
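
To turn the per-1M-token prices in the table above into a workload estimate, a minimal sketch follows. The prices are copied from the table. The function names `estimate_cost` and `rank_providers` and the token volumes in the usage example are hypothetical, chosen only for illustration. Batch and cached-read pricing are not modeled.

```python
# Per-1M-token prices in USD, copied from the provider price ladder:
# provider -> (input price, output price)
PRICES = {
    "Novita AI": (0.030, 0.050),
    "Fireworks AI": (0.100, 0.100),
    "AWS Bedrock": (0.150, 0.150),
    "Vercel AI Gateway": (0.150, 0.150),
}


def estimate_cost(provider, input_tokens, output_tokens):
    """Return the estimated USD cost of a workload on one provider route."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_providers(input_tokens, output_tokens):
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = [
        (name, estimate_cost(name, input_tokens, output_tokens))
        for name in PRICES
    ]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical monthly volume: 50M input tokens and 5M output tokens.
    for name, cost in rank_providers(50_000_000, 5_000_000):
        print(f"{name}: ${cost:.2f}")
```

At that assumed volume, the Novita AI route comes to about $1.75 and the $0.150 routes to about $8.25. Because the input and output prices differ on the cheapest route, the ranking is worth recomputing for workloads with a very different input-to-output ratio.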
Capabilities
Benchmark peer bars for Classification
Benchmark scores (2)
Migration checks
No linked migration route is available for this model yet.