Llama 3.1 70B Instruct

Name: Llama 3.1 70B Instruct
Author: AI at Meta

Released

2024-07-23

Last refreshed

2026-07-11

Status

Researched 90d ago

Open weightsCommercial use: conditionalCodingRAGLong contextClassificationJSON / Tool use

Llama 3.1 70B Instruct is worth evaluating for coding, rag, and long context when its provider route and context window match the workload.

Use it for

Teams evaluating coding, rag, and long context
Workloads that can use a 128k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Vision or document-understanding workloads

Specifications

Family: Llama 3.1
Released: 2024-07-23
Context: 128k
Parameters: 70B
Architecture: Decoder Only
Knowledge cutoff: 2023-12
Specialization: general
Openness: Open weights
License: Llama 3 CommunityCommercial use: conditional
Weights: Available
Code: Unknown
Training: Fine-tuned

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.400

Input / 1M

$0.400

Cheapest of 13 routes · DeepInfra

Providers(13)

Cloudflare Workers AI OctoAI API (Deprecated)Together AI Fireworks AI NVIDIA NIM Microsoft Foundry Databricks Foundation Model Serving Hyperbolic AI Inference DeepInfra OpenRouter IBM watsonx AWS Bedrock Vercel AI Gateway

View 13 provider routes

About

The Llama 3.1 70B Instruct model is a cutting-edge large language model with 70 billion parameters, designed for instruction-following tasks. It features multilingual capabilities, supporting languages like English, German, French, and others. Fine-tuned using supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF), it excels in understanding and responding to user instructions. The model can handle a context length of up to 128k tokens, making it suitable for complex dialogue systems and applications requiring detailed responses. It outperforms many existing open-source and proprietary models on various industry benchmarks, making it ideal for conversational AI, content generation, and data synthesis tasks. For more details, visit the Hugging Face page [1].

Llama 3.1 70B Instruct is an open-weight model in the Llama 3.1 family. The structured metadata tracks a 128k-token context window and structured outputs. This page tracks provider routes through Cloudflare Workers AI, OctoAI API (Deprecated), Together AI, and 10 more, with the cheapest tracked route listed at $0.4 input and $0.4 output per 1M tokens. Headline tracked benchmarks include HellaSwag 94.2, HumanEval 84.1, and Massive Multitask Language Understanding 86.0.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$ B

1 relevant benchmark in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

Compare all 13

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
DeepInfra	$0.400	$0.400	Serverless
Hyperbolic AI Inference	$0.400	$0.400	Serverless
OpenRouter	$0.400	$0.400	Serverless
AWS Bedrock	$0.720	$0.720	Serverless

Available via routers & gateways(8)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSSMicrosoft Foundry

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

PassthroughTogether AIFireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

SubscriptionMicrosoft Foundry

Amazon Bedrock Intelligent Prompt Routing

Router

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

PassthroughAWS Bedrock

Azure AI Foundry Model Router

Router

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

PassthroughMicrosoft Foundry

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

SubscriptionMicrosoft Foundry

Capabilities

Structured Outputs

Benchmark peer barsfor Coding

HumanEvalRank 35 of 97

Claude Sonnet 4.6

98.0

96.7

Claude Opus 4.6

95.0

Grok-3

94.5

Llama 3.1 70B Instructcurrent

84.1

Benchmark scores(3)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version	Evaluation	Source
HellaSwag	94.2	10-shotObserved 2026-03-06	—	Source
HumanEval	84.1	pass@1Observed 2026-03-06	—	Source
Massive Multitask Language Understanding	86.0	5-shotObserved 2026-03-06	—	Source

Migration checks

No linked migration route is available for this model yet.

Rankings & picks(1)

Best LLMs for ClassificationListed

Compare Llama 3.1 70B Instruct with other models

Comparison and alternatives

Browse all comparisons →

Show all 27 popular comparisonssorted by 7-day search impressions

Frequently asked questions

What is the context window of Llama 3.1 70B Instruct?

Llama 3.1 70B Instruct has a context window of 128k tokens.

How much does Llama 3.1 70B Instruct cost?

Llama 3.1 70B Instruct pricing ranges from $0.40/1M to $2.68/1M input tokens depending on the provider.

When was Llama 3.1 70B Instruct released?

Llama 3.1 70B Instruct was released on 2024-07-23.

Which providers offer Llama 3.1 70B Instruct?

Llama 3.1 70B Instruct is available from 13 providers: Cloudflare Workers AI, OctoAI API (Deprecated), Together AI, Fireworks AI, NVIDIA NIM, Microsoft Foundry, Databricks Foundation Model Serving, Hyperbolic AI Inference, DeepInfra, OpenRouter, IBM watsonx, AWS Bedrock, Vercel AI Gateway.

What benchmarks has Llama 3.1 70B Instruct been tested on?

Llama 3.1 70B Instruct has been evaluated on 3 benchmarks, including HellaSwag, HumanEval, Massive Multitask Language Understanding.

Created by

AI at Meta

Large-scale open-source AI for social technologies.

Menlo Park, California, United States

Founded 2013

Website

Pricing

Output / 1M

$0.400

Input / 1M

$0.400

Cheapest of 13 routes · DeepInfra

Providers(13)

View 13 provider routes