LLM Reference

Llama 4 Scout 17B-16E Instruct

llama-4-scout-17b-16e-instruct

Open Source

About

Meta's Llama 4 Scout is a 17-billion-parameter mixture-of-experts model with 16 experts. It is optimized for efficient inference in edge and cloud environments, with strong multi-turn conversation capabilities. Available on Cloudflare Workers AI.
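Since the listing notes availability on Cloudflare Workers AI, a request can be sketched roughly as follows. This is a minimal sketch, not a definitive integration: the account ID is a placeholder, the model slug is an assumption, and the URL follows Cloudflare's account-scoped `/ai/run/<model>` REST pattern. It only builds the request; sending it requires an API token.

```python
import json

ACCOUNT_ID = "your-account-id"  # assumption: replace with your Cloudflare account ID
MODEL = "@cf/meta/llama-4-scout-17b-16e-instruct"  # assumed Workers AI model slug

def build_request(prompt: str, max_tokens: int = 256) -> tuple[str, dict]:
    """Return the (url, payload) pair for a Workers AI chat-style request."""
    url = f"https://api.cloudflare.com/client/v4/accounts/{ACCOUNT_ID}/ai/run/{MODEL}"
    payload = {
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": prompt},
        ],
        "max_tokens": max_tokens,
    }
    return url, payload

url, payload = build_request("Summarize mixture-of-experts routing in one sentence.")
print(url)
print(json.dumps(payload, indent=2))
```

In practice the payload would be POSTed to `url` with an `Authorization: Bearer <token>` header.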

Llama 4 Scout 17B-16E Instruct has a 328K-token context window.

Llama 4 Scout 17B-16E Instruct costs $0.08 per 1M input tokens and $0.30 per 1M output tokens.
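The per-million-token rates above make request cost a simple linear sum. A small sketch of that arithmetic (using the $0.08 / $0.30 rates listed here; provider prices in the table below differ):

```python
# Listed rates in USD per 1M tokens.
INPUT_PER_M = 0.08
OUTPUT_PER_M = 0.30

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single request at the listed per-million rates."""
    return input_tokens / 1e6 * INPUT_PER_M + output_tokens / 1e6 * OUTPUT_PER_M

# Example: a 10K-token prompt with a 1K-token completion.
print(f"${request_cost(10_000, 1_000):.6f}")  # → $0.001100
```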

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning

Providers (8)

Provider        Input (per 1M)   Output (per 1M)   Type
OpenRouter      $0.08            $0.30             Serverless
Together AI     -                -                 Serverless
Fireworks AI    -                -                 Serverless
DeepInfra       $0.08            $0.30             Serverless
GCP Vertex AI   $0.20            $0.65             Serverless
NVIDIA NIM      -                -                 Serverless
GroqCloud       $0.11            $0.34             Serverless
AWS Bedrock     $0.17            $0.22             Serverless

Benchmark Scores (1)

Benchmark   Score   Version    Source
τ-bench     62.3    τ-bench    https://taubench.com/

Specifications

Family          Llama 4
Released        2025-04-05
Parameters      17B
Context         328K
Architecture    Mixture of Experts
Specialization  general
Training        pretrained
Fine-tuning     instruction-tuning

Created by

Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States
Founded 2013
Website