Llama 3.2 1B Instruct

Name: Llama 3.2 1B Instruct
Author: AI at Meta

Released

2024-09-25

Last refreshed

2026-07-09

Status

Researched 60d ago

Open weightsCommercial use: conditionalCodingRAGLong contextClassificationJSON / Tool use

Llama 3.2 1B Instruct is worth evaluating for coding, rag, and long context when its provider route and context window match the workload.

Use it for

Teams evaluating coding, rag, and long context
Workloads that can use a 128k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Llama 3.2
Released: 2024-09-25
Context: 128k
Parameters: 1.23B
Architecture: Decoder Only
Knowledge cutoff: 2023-12
Specialization: general
Openness: Open weights
License: Llama 3 CommunityCommercial use: conditional
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.100

Input / 1M

$0.100

Cheapest of 7 routes · AWS Bedrock

Providers(7)

Cloudflare Workers AI OpenRouter Fireworks AI NVIDIA NIM Bitdeer AI AWS Bedrock Vercel AI Gateway

View 7 provider routes

About

Llama 3.2 1B Instruct is Meta's Llama 3.2 model. It offers a 128K-token context window with weights openly available for self-hosting and scores 25.6 on GPQA.

Llama 3.2 1B Instruct is an open-weight model in the Llama 3.2 family. The structured metadata tracks a 128k-token context window and structured outputs. This page tracks provider routes through Cloudflare Workers AI, OpenRouter, Fireworks AI, and 4 more, with the cheapest tracked route listed at $0.027 input and $0.2 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 25.6, HellaSwag 78.9, and HumanEval 28.1.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$ B

1 relevant benchmark in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 7

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
AWS Bedrock	$0.100	$0.100	Serverless
Fireworks AI	$0.100	$0.100	Serverless
Vercel AI Gateway	$0.100	$0.100	Serverless
OpenRouter	$0.027	$0.200	Serverless

Available via routers & gateways(3)

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

PassthroughFireworks AI

Amazon Bedrock Intelligent Prompt Routing

Router

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

PassthroughAWS Bedrock

NVIDIA LLM Router Blueprint

Router

NVIDIA's open-source AI blueprint for LLM routing that selects the optimal model per prompt via intent classification or neural auto-routing; being deprecated 2026-06-20.

Free OSSNVIDIA NIM

Capabilities

Structured Outputs

Benchmark peer barsfor Coding

HumanEvalRank 96 of 97

Claude Sonnet 4.6

98.0

96.7

Claude Opus 4.6

95.0

Grok-3

94.5

Llama 3.2 1B Instructcurrent

28.1

Benchmark scores(6)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version	Evaluation	Source
Google-Proof Q&A	25.6	diamondObserved 2026-03-06	—	Source
HellaSwag	78.9	10-shotObserved 2026-03-06	—	Source
HumanEval	28.1	pass@1Observed 2026-03-06	—	Source
Massive Multitask Language Understanding	49.3	5-shotObserved 2026-03-06	—	Source
BFCL	10.8	—Observed 2026-04-14	—	Source
MMLU PRO	20.0	—Observed 2026-04-14	—	Source

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(2)

Best Small Language Models (SLMs)Listed Cheapest LLM APIs You Can Call Right NowListed

Compare Llama 3.2 1B Instruct with other models

Comparison and alternatives

Browse all comparisons →

Show all 36 popular comparisonssorted by 7-day search impressions

Frequently asked questions

What is the context window of Llama 3.2 1B Instruct?

Llama 3.2 1B Instruct has a context window of 128k tokens.

How much does Llama 3.2 1B Instruct cost?

Llama 3.2 1B Instruct pricing ranges from $0.027/1M to $0.15/1M input tokens depending on the provider.

When was Llama 3.2 1B Instruct released?

Llama 3.2 1B Instruct was released on 2024-09-25.

Which providers offer Llama 3.2 1B Instruct?

Llama 3.2 1B Instruct is available from 7 providers: Cloudflare Workers AI, OpenRouter, Fireworks AI, NVIDIA NIM, Bitdeer AI, AWS Bedrock, Vercel AI Gateway.

What benchmarks has Llama 3.2 1B Instruct been tested on?

Llama 3.2 1B Instruct has been evaluated on 6 benchmarks, including Google-Proof Q&A, HellaSwag, HumanEval, Massive Multitask Language Understanding, BFCL.

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.100

Input / 1M

$0.100

Cheapest of 7 routes · AWS Bedrock

Providers(7)

Cloudflare Workers AI OpenRouter Fireworks AI NVIDIA NIM Bitdeer AI AWS Bedrock Vercel AI Gateway

View 7 provider routes