Llama 3.1 405B Instruct
Llama 3.1 405B Instruct is worth evaluating for RAG, long-context, and classification workloads when its provider route and context window match the workload.
Use it for
- Teams evaluating RAG, long context, and classification
- Workloads that can use a 128k context window
- Buyers comparing 4 tracked provider routes
Do not use it for
- Vision or document-understanding workloads

| Field | Value |
|---|---|
| Family | Llama 3.1 |
| Released | 2024-07-23 |
| Context | 128k |
| Parameters | 405B |
| Architecture | Decoder-only |
| Knowledge cutoff | 2023-12 |
| Specialization | General |
| Training | Fine-tuned |
Large-scale open-source AI for social technologies.
Cheapest of 11 routes · AWS Bedrock
About
Llama 3.1 405B Instruct is Meta's 405-billion-parameter instruction-tuned large language model, released on July 23, 2024. It uses an optimized transformer architecture, and the instruct version is aligned with supervised fine-tuning and reinforcement learning to improve instruction following. The model supports multiple languages, was trained on about 15 trillion tokens, and was fine-tuned with 25 million synthetic examples. It is intended for multilingual dialogue and text generation, which suits assistant-style applications. Meta reports that it outperforms many existing models on common industry benchmarks, and the release includes safety mitigations. AI engineers can access the model through its Hugging Face page.
Llama 3.1 405B Instruct is an open-source model in the Llama 3.1 family. The structured metadata lists a 128k-token context window and support for structured outputs. This page tracks provider routes through OctoAI API (Deprecated), Together AI, Fireworks AI, and 8 more, with the cheapest tracked route (AWS Bedrock) listed at $2.40 input and $2.40 output per 1M tokens. The headline tracked benchmark is Massive Multitask Language Understanding (MMLU, 5-shot) at 88.6.
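As a rough illustration of checking whether a workload fits the 128k-token context window, the sketch below estimates a prompt's token count and compares it against the window. It makes two assumptions that are not from this page: "128k" is read as 128,000 tokens, and token count is approximated at about 4 characters per token. The names `estimate_tokens` and `fits_context` are hypothetical; an exact count requires the model's own tokenizer.

```python
# Assumption: "128k" is read as 128,000 tokens; some providers may expose a different limit.
CONTEXT_WINDOW = 128_000

# Assumption: crude heuristic of ~4 characters per token for English text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate; use the real tokenizer for exact counts."""
    return len(text) // CHARS_PER_TOKEN + 1


def fits_context(prompt: str, max_output_tokens: int, window: int = CONTEXT_WINDOW) -> bool:
    """Check that the estimated prompt size plus the output budget stays within the window."""
    return estimate_tokens(prompt) + max_output_tokens <= window


if __name__ == "__main__":
    short_prompt = "Summarize the attached report. " * 100
    print(fits_context(short_prompt, max_output_tokens=1_000))
```

Because the estimate is approximate, leaving headroom below the limit is a safer choice than budgeting right up to it.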
Top use-case fit
RAG
Included by capability and metadata signals in the decision map.
Long context
Included by capability and metadata signals in the decision map.
Classification
1 relevant benchmark in the decision map.
Provider price ladder
Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| AWS Bedrock | $2.40 | $2.40 | Serverless |
| Fireworks AI | $3.00 | $3.00 | Serverless |
| Hyperbolic AI Inference | $4.00 | $4.00 | Serverless |
| IBM watsonx | $3.00 | $9.00 | Serverless |
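To make the table concrete, here is a minimal sketch that turns the listed per-1M-token prices into a per-request cost and ranks the four providers for a given token mix. The prices are copied from the table above; the function names `request_cost` and `rank_providers` and the example workload sizes are hypothetical. It ignores batch and cached-read discounts.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
PRICES = {
    "AWS Bedrock": (2.40, 2.40),
    "Fireworks AI": (3.00, 3.00),
    "Hyperbolic AI Inference": (4.00, 4.00),
    "IBM watsonx": (3.00, 9.00),
}


def request_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the provider's listed prices."""
    input_price, output_price = PRICES[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def rank_providers(input_tokens: int, output_tokens: int) -> list[tuple[str, float]]:
    """Return (provider, cost) pairs sorted from cheapest to most expensive."""
    costs = {name: request_cost(name, input_tokens, output_tokens) for name in PRICES}
    return sorted(costs.items(), key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical long-context request: 100,000 input tokens, 2,000 output tokens.
    for name, cost in rank_providers(100_000, 2_000):
        print(f"{name}: ${cost:.4f}")
```

The ranking depends on the token mix: a route with asymmetric pricing such as IBM watsonx sits close to Fireworks AI on input-heavy requests but becomes the most expensive of the four when output tokens dominate.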
Capabilities
Benchmark peer bars for Classification
Benchmark scores (1)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Massive Multitask Language Understanding | 88.6 | 5-shot | https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard |
Migration checks
No linked migration route is available for this model yet.