Llama 3.1 70B Instruct
Llama 3.1 70B Instruct is worth evaluating for coding, RAG, and long-context workloads when its provider route and context window match the workload.
Use it for
- Teams evaluating coding, RAG, and long-context workloads
- Workloads that can use a 128k context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Vision or document-understanding workloads

| Field | Value |
|---|---|
| Family | Llama 3.1 |
| Released | 2024-07-23 |
| Context | 128k |
| Parameters | 70B |
| Architecture | Decoder-only |
| Knowledge cutoff | 2023-12 |
| Specialization | General |
| Training | Fine-tuned |
Large-scale open-source AI for social technologies.
Cheapest of 13 routes · DeepInfra
About
Llama 3.1 70B Instruct is a 70-billion-parameter large language model tuned for instruction following. It is multilingual, supporting English, German, French, and other languages. It was fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to improve how it understands and responds to user instructions. The model handles a context length of up to 128k tokens, which suits complex dialogue systems and applications that need detailed responses. It outperforms many open-source and proprietary models on various industry benchmarks and is suited to conversational AI, content generation, and data synthesis. For more details, see the Hugging Face page [1].
Llama 3.1 70B Instruct is an open-source model in the Llama 3.1 family. Its structured metadata lists a 128k-token context window and support for structured outputs. This page tracks provider routes through Cloudflare Workers AI, OctoAI API (Deprecated), Together AI, and 10 more, with the cheapest tracked route listed at $0.40 input and $0.40 output per 1M tokens. Headline tracked benchmarks include HellaSwag 94.2, HumanEval 84.1, and Massive Multitask Language Understanding (MMLU) 86.0.
Top use-case fit: coding, agents, and build tasks
Coding
Q/$ B · 1 relevant benchmark in the decision map.
RAG
Included by capability and metadata signals in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Provider price ladder
Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| DeepInfra | $0.400 | $0.400 | Serverless |
| Hyperbolic AI Inference | $0.400 | $0.400 | Serverless |
| OpenRouter | $0.400 | $0.400 | Serverless |
| AWS Bedrock | $0.720 | $0.720 | Serverless |
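The per-1M-token prices above can be turned into a rough workload estimate. The following is a minimal sketch, not a pricing tool: the `Route` class and the `estimate_cost` and `cheapest` helpers are hypothetical names introduced here for illustration, the prices are copied from the table and may be out of date, and batch or cached-read discounts are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """One provider route (hypothetical structure for this example)."""
    provider: str
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Prices copied from the table above; assumed current only for illustration.
ROUTES = [
    Route("DeepInfra", 0.40, 0.40),
    Route("Hyperbolic AI Inference", 0.40, 0.40),
    Route("OpenRouter", 0.40, 0.40),
    Route("AWS Bedrock", 0.72, 0.72),
]


def estimate_cost(route: Route, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a workload on one route, without discounts."""
    total = input_tokens * route.input_per_m + output_tokens * route.output_per_m
    return total / 1_000_000


def cheapest(routes: list[Route], input_tokens: int, output_tokens: int) -> Route:
    """Route with the lowest estimated cost; ties go to the first one listed."""
    return min(routes, key=lambda r: estimate_cost(r, input_tokens, output_tokens))


if __name__ == "__main__":
    # Example workload: 2M input tokens and 500k output tokens.
    for route in ROUTES:
        cost = estimate_cost(route, 2_000_000, 500_000)
        print(f"{route.provider}: ${cost:.2f}")
    print("Cheapest:", cheapest(ROUTES, 2_000_000, 500_000).provider)
```

Because three routes share the same listed price, the tie is resolved by list order; in practice, latency, rate limits, and supported context length would likely decide between them.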
Capabilities
Benchmark peer bars for Coding
Benchmark scores (3)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| HellaSwag | 94.2 | 10-shot | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| HumanEval | 84.1 | pass@1 | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
| Massive Multitask Language Understanding | 86.0 | 5-shot | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
Migration checks
No linked migration route is available for this model yet.