LLM Reference

Llama 3 8B Instruct

Released: 2024-04-18
Last refreshed: 2026-05-22
Status: Researched 55d ago
Open Weights · Commercial use with conditions · Coding · Classification · JSON / Tool use

Llama 3 8B Instruct is worth evaluating for coding, classification, and JSON / tool use when its provider route and context window match the workload.

Use it for

  • Teams evaluating coding, classification, and JSON / tool use
  • Workloads that can use an 8k context window
  • Buyers comparing 4 tracked provider routes
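
Whether a workload "can use an 8k context window" can be checked roughly before sending a request. The sketch below is illustrative only: it assumes "8k" means 8,192 tokens and uses a coarse four-characters-per-token heuristic rather than the model's real tokenizer, so treat the result as an estimate. The function names are hypothetical.

```python
CONTEXT_WINDOW_TOKENS = 8192  # assumption: the page's "8k" read as 8,192 tokens
CHARS_PER_TOKEN = 4           # assumption: coarse heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Roughly estimate the token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(prompt: str, max_output_tokens: int,
                 window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if the estimated prompt plus the reserved output fits the window."""
    return estimate_tokens(prompt) + max_output_tokens <= window
```

For production use, counting tokens with the model's actual tokenizer would replace `estimate_tokens`; the budget check itself stays the same.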

Do not use it for

  • Vision or document-understanding workloads
Specifications

Family: Llama 3
Released: 2024-04-18
Context: 8k
Parameters: 8B
Architecture: Decoder Only
Knowledge cutoff: 2023-03
Specialization: general
Openness: Open weights
License: Llama 3 Community · Commercial use with conditions
Training: finetuned
Created by

Large-scale open-source AI for social technologies.

Menlo Park, California, United States
Founded 2013
Pricing
Output / 1M: $0.040
Input / 1M: $0.030

Cheapest of 17 routes · OpenRouter

About

Llama 3 8B Instruct is an 8-billion-parameter instruction-following language model that was Meta's latest at the time of its release on April 18, 2024. It uses an auto-regressive transformer architecture with Grouped-Query Attention for improved scalability. Trained on over 15 trillion tokens and fine-tuned with 10 million human-annotated examples, it is tuned for dialogue and conversational tasks. The model outperforms its predecessors on industry benchmarks, scoring 68.4 on MMLU (5-shot). It is intended for commercial and research applications and prioritizes safety and helpfulness, which makes it suitable for chatbots, virtual assistants, and other interactive AI applications. For more details, visit the Hugging Face page [1].

Llama 3 8B Instruct is an open-weight model in the Llama 3 family. The structured metadata tracks an 8k-token context window and structured outputs. This page tracks provider routes through AWS Bedrock, DeepInfra, OctoAI API (Deprecated), and 14 more, with the cheapest tracked route listed at $0.03 input and $0.04 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 44.8, HellaSwag 91.1, and HumanEval 68.2.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$: A

1 relevant benchmark in the decision map.

Classification

Q/$: A

3 relevant benchmarks in the decision map.

JSON / Tool use

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider        Input / 1M   Output / 1M   Route
OpenRouter      $0.030       $0.040        Serverless
Novita AI       $0.040       $0.040        Serverless
Lepton AI API   $0.070       $0.070        Serverless
DeepInfra       $0.050       $0.150        Serverless
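
Because input and output tokens are priced separately, the cheapest route after the first two can depend on a workload's input-to-output ratio. The sketch below ranks the four listed routes for a given token mix. Prices are copied from the ladder above; the `Route` type and function names are illustrative, and batch or cached-read discounts are not modeled.

```python
from typing import List, NamedTuple


class Route(NamedTuple):
    provider: str
    input_per_1m: float   # USD per 1M input tokens
    output_per_1m: float  # USD per 1M output tokens


# Prices as listed in the provider price ladder on this page.
ROUTES = [
    Route("OpenRouter", 0.030, 0.040),
    Route("Novita AI", 0.040, 0.040),
    Route("Lepton AI API", 0.070, 0.070),
    Route("DeepInfra", 0.050, 0.150),
]


def workload_cost(route: Route, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload on one route."""
    total = input_tokens * route.input_per_1m + output_tokens * route.output_per_1m
    return total / 1_000_000


def rank_routes(input_tokens: int, output_tokens: int) -> List[Route]:
    """Return the routes sorted from cheapest to most expensive for this token mix."""
    return sorted(ROUTES, key=lambda r: workload_cost(r, input_tokens, output_tokens))
```

With these prices, OpenRouter and Novita AI rank first and second for any mix, while DeepInfra undercuts Lepton AI API only when input tokens exceed four times the output tokens, for example in long-prompt classification.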

Available via routers & gateways (16)

AIRouter

Router

Commercial LLM router that analyzes incoming requests and routes to the optimal model for cost/quality/latency via a drop-in OpenAI-compatible API, with a privacy-preserving embedding mode that avoids sending prompt content.

Passthrough + fee · GCP Vertex AI

Amazon Bedrock Intelligent Prompt Routing

Router

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

Passthrough · AWS Bedrock

Azure AI Foundry Model Router

Router

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

Passthrough · Microsoft Foundry

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

Subscription · Microsoft Foundry · GCP Vertex AI

Kong AI Gateway

Gateway

Multi-LLM AI gateway built on Kong Gateway 3.x, adding semantic routing, load balancing, guardrails, and MCP traffic analytics as plugins over Kong's existing API management platform.

Subscription · GCP Vertex AI · Microsoft Foundry

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSS · GCP Vertex AI · Microsoft Foundry

Capabilities

Structured Outputs
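
For classification workloads that rely on structured outputs, it is usually worth validating the model's reply before using it, since a provider route may or may not enforce a schema. The sketch below is a minimal illustration using only the standard library. The reply shape (`label`, `confidence`) and the label set are hypothetical and are not defined by this page or by any provider.

```python
import json

# Hypothetical label set for illustration; replace with the labels your task uses.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def parse_classification(raw: str) -> dict:
    """Parse and validate a reply expected to look like
    {"label": "<one of ALLOWED_LABELS>", "confidence": <number in [0, 1]>}.
    Raises ValueError when the reply does not match that shape."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    label = data.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unexpected label: {label!r}")

    confidence = data.get("confidence")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")

    return {"label": label, "confidence": float(confidence)}
```

A caller can catch `ValueError` and retry the request or fall back to a default label, which keeps malformed replies out of downstream code.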

Benchmark peer bars for Coding

Benchmark scores (7)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.
Benchmark                                  Score   Version   Source
Google-Proof Q&A                           44.8    diamond   https://arxiv.org/abs/2407.21783
HellaSwag                                  91.1    10-shot   research
HumanEval                                  68.2    pass@1    research
Massive Multitask Language Understanding   76.9    5-shot    https://arxiv.org/abs/2407.21783
Instruction-Following Evaluation           59.5    v2        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
MMLU PRO                                   40.5              https://huggingface.co/spaces/TIGER-Lab/MMLU-Pro
Grade School Math 8K                       80.6              https://arxiv.org/abs/2407.21783

Migration checks

No linked migration route is available for this model yet.

Popular comparisons (51), sorted by 7-day search impressions

  • Llama 3 8B Instruct vs Gemma 2 9B SahabatAI Instruct: 81
  • Llama 3 8B Instruct vs Qwen3.5-122B-A10B: 78
  • Llama 3 8B Instruct vs Claude Opus 4.7: 74
  • Llama 3 8B Instruct vs Grok 4: 74
  • Llama 3 8B Instruct vs Phi 3.5 Mini Instruct: 73
  • Llama 3 8B Instruct vs Mixtral 8x7B: 68
  • Llama 3 8B Instruct vs Phi-4 Mini Flash Reasoning: 63
  • Llama 3 8B Instruct vs Kimi K2.6: 60
  • Llama 3 8B Instruct vs Qwen3.5-35B-A3B: 59
  • Llama 3 8B Instruct vs GLM-5.1: 55
  • Llama 3 8B Instruct vs Llama 2 13B Chat: 54
  • Llama 3 8B Instruct vs DeepSeek R1 Distill Llama 70B: 53
  • Llama 3 8B Instruct vs Gemma 7B Instruct: 50
  • Llama 3 8B Instruct vs Qwen2.5-Max: 45
  • Llama 3 8B Instruct vs Claude Opus 4.5: 45
  • Llama 3 8B Instruct vs Mistral Large 2: 40
  • Llama 3 8B Instruct vs Qwen2.5-72B: 40
  • Llama 3 8B Instruct vs Claude Haiku 4.5: 38
  • Llama 3 8B Instruct vs Grok-3: 37
  • Llama 3 8B Instruct vs GPT-5.5: 36
  • Llama 3 8B Instruct vs Qwen3.5-397B-A17B: 30
  • Llama 3 8B Instruct vs DeepSeek R1 0528: 28
  • Llama 3 8B Instruct vs ShieldGemma 9B: 27
  • Llama 3 8B Instruct vs Xiaomi MiMo-V2.5: 26
  • Llama 3 8B Instruct vs Grok 4.20: 26
  • Llama 3 8B Instruct vs GLM-5 Turbo: 25
  • Llama 3 8B Instruct vs Claude 3.7 Sonnet: 25
  • Llama 3 8B Instruct vs Kimi K2 Thinking: 22
  • Llama 3 8B Instruct vs Claude Sonnet 4.5: 18
  • Llama 3 8B Instruct vs Step 3.5 Flash: 16
  • Llama 3 8B Instruct vs GLM-5 9B: 16
  • Llama 3 8B Instruct vs DeepSeek V3.1: 15
  • Llama 3 8B Instruct vs Grok 3 Mini: 15
  • Llama 3 8B Instruct vs Llama 3.1 405B: 15
  • Llama 3 8B Instruct vs GPT-5.4: 13
  • Llama 3 8B Instruct vs Qwen3-235B-A22B: 11
  • Llama 3 8B Instruct vs Trinity-Large-Thinking: 11
  • Llama 3 8B Instruct vs Kimi K2.5: 11
  • Llama 3 8B Instruct vs Gemini 2.5 Flash: 10
  • Llama 3 8B Instruct vs GPT-5.2: 9
  • Llama 3 8B Instruct vs GLM-5: 9
  • Llama 3 8B Instruct vs Qwen2.5-72B-Instruct: 8
  • Llama 3 8B Instruct vs Gemini 2.5 Pro Preview 05-06: 7
  • Llama 3 8B Instruct vs Kimi K2 Instruct: 6
  • Llama 3 8B Instruct vs Gemini 2.5 Pro: 6
  • Llama 3 8B Instruct vs DeepSeek V3: 5
  • Llama 3 8B Instruct vs Qwen2-7B-Instruct: 5
  • Llama 3 8B Instruct vs Trinity-Large-Preview: 4
  • Llama 3 8B Instruct vs GLM-5V-Turbo: 3
  • Llama 3 8B Instruct vs o3 Mini: 0
  • Llama 3 8B Instruct vs Qwen3.5-27B: 0