Llama 3.2 11B Vision Instruct

Name: Llama 3.2 11B Vision Instruct
Author: AI at Meta

Released

2024-09-25

Last refreshed

2026-07-09

Status

Researched 47d ago

Open weightsCommercial use: conditionalMultimodalRAGLong contextVisionJSON / Tool use

Llama 3.2 11B Vision Instruct is worth evaluating for rag, long context, and vision when its provider route and context window match the workload.

Use it for

Teams evaluating rag, long context, and vision
Workloads that can use a 128k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Workloads where another current model has stronger sourced task evidence

Specifications

Family: Llama 3.2
Released: 2024-09-25
Context: 128k
Parameters: 10.6B
Architecture: Decoder Only
Knowledge cutoff: 2024-03
Specialization: general
Openness: Open weights
License: Llama 3 CommunityCommercial use: conditional
Weights: Unknown
Code: Unknown
Training: Fine-tuned

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.160

Input / 1M

$0.160

Cheapest of 8 routes · Vercel AI Gateway

Providers(8)

Cloudflare Workers AI OpenRouter Fireworks AI NVIDIA NIM Bitdeer AI AWS Bedrock Microsoft Foundry Vercel AI Gateway

View 8 provider routes

About

Instruction-tuned 11B Llama 3.2 Vision model for image reasoning, visual question answering, document understanding, and captioning. NVIDIA NIM lists text plus image input, text output, and a 128K context window for the Llama 3.2 Vision collection.

Llama 3.2 11B Vision Instruct is Meta's entry-level multimodal model, released in September 2024 as part of the Llama 3.2 family. With 11 billion parameters, it was among the first openly available Meta models to accept image inputs alongside text, supporting a 128,000-token combined context window for text and image content. The model produces text-only output. NVIDIA NIM documents it as accepting text plus image input with text output within the Llama 3.2 Vision collection's shared context limit.

The instruction-tuned variant is fine-tuned for visual question answering, image captioning, document understanding, and figure interpretation in both single-turn and multi-turn conversational settings. It uses the same Llama 3 tokenizer and base architecture as the text-only Llama 3.2 models, extended with a vision encoder that projects image patches into the language model's embedding space.

Llama 3.2 11B Vision Instruct is available as open weights under Meta's Llama Community License and hosted on OpenRouter, Fireworks AI, NVIDIA NIM, AWS Bedrock, Azure AI Foundry, and Bitdeer. Teams needing stronger visual reasoning at the cost of higher compute should evaluate the Llama 3.2 90B Vision Instruct variant, which shares the same architecture and context window but has substantially more parameters.

Llama 3.2 11B Vision Instruct has a 128k-token context window.

Llama 3.2 11B Vision Instruct input tokens at $0.049/1M, output at $0.676/1M.

Top use-case fit

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 8

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
Vercel AI Gateway	$0.160	$0.160	Serverless
Fireworks AI	$0.200	$0.200	Serverless
OpenRouter	$0.245	$0.245	Serverless
AWS Bedrock	$0.200	$0.270	Serverless

Available via routers & gateways(8)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSSMicrosoft Foundry

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

PassthroughFireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

SubscriptionMicrosoft Foundry

Amazon Bedrock Intelligent Prompt Routing

Router

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

PassthroughAWS Bedrock

Azure AI Foundry Model Router

Router

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

PassthroughMicrosoft Foundry

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

SubscriptionMicrosoft Foundry

Capabilities

VisionMultimodalStructured Outputs

Benchmark peer barsfor RAG

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

Frequently asked questions

What is the context window of Llama 3.2 11B Vision Instruct?

Llama 3.2 11B Vision Instruct has a context window of 128k tokens.

How much does Llama 3.2 11B Vision Instruct cost?

Llama 3.2 11B Vision Instruct pricing ranges from $0.049/1M to $0.37/1M input tokens depending on the provider.

When was Llama 3.2 11B Vision Instruct released?

Llama 3.2 11B Vision Instruct was released on 2024-09-25.

Which providers offer Llama 3.2 11B Vision Instruct?

Llama 3.2 11B Vision Instruct is available from 8 providers: Cloudflare Workers AI, OpenRouter, Fireworks AI, NVIDIA NIM, Bitdeer AI, AWS Bedrock, Microsoft Foundry, Vercel AI Gateway.

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.160

Input / 1M

$0.160

Cheapest of 8 routes · Vercel AI Gateway

Providers(8)

Cloudflare Workers AI OpenRouter Fireworks AI NVIDIA NIM Bitdeer AI AWS Bedrock Microsoft Foundry Vercel AI Gateway

View 8 provider routes