LLM Reference

Llama 3 Models by AI at Meta

AI at Meta · Llama 3 Community · Open weights · Highlight · Open Source
11 models · 2024–2025 · Up to 8k context · From $0.03/1M input tokens

Details

Researcher: AI at Meta
Commercial use: Allowed, with conditions
Models: 11
Released: 2024–2025
Max context: 8k

Capabilities

Function Calling: 1 of 11 models
Tool Use: 1 of 11 models
Structured Outputs: 7 of 11 models

About

Llama 3, developed by AI at Meta and released in April 2024, is a family of open-weight large language models (LLMs). It comes in two sizes, 8 billion and 70 billion parameters, each available as a pretrained and an instruction-tuned version; the instruction-tuned versions are aimed at dialogue. Llama 3 was trained on over 15 trillion tokens of publicly available data, a far larger corpus than that of its predecessor, Llama 2, with substantially more code data. The release ships alongside safety tools such as Llama Guard 2 and Code Shield, reflecting Meta's focus on responsible AI use. Llama 3 models are available on platforms such as AWS, Google Cloud, and Hugging Face, with future updates planned to add multimodal capabilities and multilingual support.

Current Variants

The "use when" guidance below is derived from each model's seed capabilities, context, release, and replacement fields.

Use when the workload needs 8k context, 8B parameters, and tool use.

2025-07 · 8k context · 8B parameters · tool use

Use when the workload needs 8k context and 70B parameters.

2024-07 · 8k context · 70B parameters

Use when the workload needs 8k context, 70B parameters, and structured outputs.

2024-04 · 8k context · 70B parameters · structured outputs

Use when the workload needs 8k context, 8B parameters, and structured outputs.

2024-04 · 8k context · 8B parameters · structured outputs

Use when the workload needs 8k context and 70B parameters.

2024-04 · 8k context · 70B parameters

Llama 3 8B · Current

Use when the workload needs 8k context and 8B parameters.

2024-04 · 8k context · 8B parameters

Use when the workload needs 8k context, 8B parameters, and structured outputs.

2024-04 · 8k context · 8B parameters · structured outputs

Use when the workload needs 8k context, 70B parameters, and structured outputs.

2024-04 · 8k context · 70B parameters · structured outputs

Use when the workload needs 8k context, 8B parameters, and structured outputs.

2024-04 · 8k context · 8B parameters · structured outputs

Use when the workload needs 8k context, 70B parameters, and structured outputs.

2024-04 · 8k context · 70B parameters · structured outputs

Use when the workload needs 8k context and 8B parameters.

2024-04 · 8k context · 8B parameters

Release Timeline

3 release groups

2025-07 (1 current)
Together AI - Llama 3 8B Lite (Current): 8k context · 8B parameters · tool use

2024-07 (1 current)
Llama 3 Taiwan 70B Instruct (Current): 8k context · 70B parameters

2024-04 (9 current)
DeepInfra Llama 3 70B Instruct (Current): 8k context · 70B parameters · structured outputs
DeepInfra Llama 3 8B Instruct (Current): 8k context · 8B parameters · structured outputs
Fireworks Llama-3-8B-Instruct (Current): 8k context · 8B parameters
Llama 3 70B (Current): 8k context · 70B parameters
Llama 3 70B Instruct (Current): 8k context · 70B parameters · structured outputs
Llama 3 8B (Current): 8k context · 8B parameters
Llama 3 8B Instruct (Current): 8k context · 8B parameters · structured outputs
Together AI Llama-3-70B-Instruct (Current): 8k context · 70B parameters · structured outputs
Together AI Llama-3-8B-Instruct (Current): 8k context · 8B parameters · structured outputs

Specifications (11 models)

Llama 3 model specifications comparison
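| Model | Released | Context | Parameters | Fn Calling | Tool Use | Structured Outputs |
|---|---|---|---|---|---|---|
| Together AI - Llama 3 8B Lite | 2025-07 | 8k | 8B | Yes | Yes | Yes |
| Llama 3 Taiwan 70B Instruct | 2024-07 | 8k | 70B | No | No | No |
| Llama 3 70B Instruct | 2024-04 | 8k | 70B | No | No | Yes |
| Llama 3 8B Instruct | 2024-04 | 8k | 8B | No | No | Yes |
| Llama 3 70B | 2024-04 | 8k | 70B | No | No | No |
| Llama 3 8B | 2024-04 | 8k | 8B | No | No | No |
| Together AI Llama-3-8B-Instruct | 2024-04 | 8k | 8B | No | No | Yes |
| Together AI Llama-3-70B-Instruct | 2024-04 | 8k | 70B | No | No | Yes |
| DeepInfra Llama 3 8B Instruct | 2024-04 | 8k | 8B | No | No | Yes |
| DeepInfra Llama 3 70B Instruct | 2024-04 | 8k | 70B | No | No | Yes |
| Fireworks Llama-3-8B-Instruct | 2024-04 | 8k | 8B | No | No | No |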
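
The specifications table can be treated as a small dataset and filtered by size and capability. The following is a minimal sketch, not part of the original page: the names `Variant`, `VARIANTS`, and `matching` are hypothetical, and the rows are transcribed by hand from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    # Field names are illustrative; values come from the specifications table.
    name: str
    released: str
    parameters: str
    function_calling: bool
    tool_use: bool
    structured_outputs: bool


# All eleven listed variants (every one lists 8k context).
VARIANTS = [
    Variant("Together AI - Llama 3 8B Lite", "2025-07", "8B", True, True, True),
    Variant("Llama 3 Taiwan 70B Instruct", "2024-07", "70B", False, False, False),
    Variant("Llama 3 70B Instruct", "2024-04", "70B", False, False, True),
    Variant("Llama 3 8B Instruct", "2024-04", "8B", False, False, True),
    Variant("Llama 3 70B", "2024-04", "70B", False, False, False),
    Variant("Llama 3 8B", "2024-04", "8B", False, False, False),
    Variant("Together AI Llama-3-8B-Instruct", "2024-04", "8B", False, False, True),
    Variant("Together AI Llama-3-70B-Instruct", "2024-04", "70B", False, False, True),
    Variant("DeepInfra Llama 3 8B Instruct", "2024-04", "8B", False, False, True),
    Variant("DeepInfra Llama 3 70B Instruct", "2024-04", "70B", False, False, True),
    Variant("Fireworks Llama-3-8B-Instruct", "2024-04", "8B", False, False, False),
]


def matching(variants, parameters=None, needs=()):
    """Return variants of the given size that have every capability in `needs`."""
    return [
        v for v in variants
        if (parameters is None or v.parameters == parameters)
        and all(getattr(v, capability) for capability in needs)
    ]


# Example: 70B variants that list structured outputs.
for v in matching(VARIANTS, parameters="70B", needs=("structured_outputs",)):
    print(v.name)
```

Filtering on `structured_outputs` alone returns seven variants and filtering on `tool_use` returns one, which agrees with the counts in the Capabilities section.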

Pricing

Llama 3 model pricing by provider
| Model | Provider | Input / 1M tokens | Output / 1M tokens | Type |
|---|---|---|---|---|
| Llama 3 8B Instruct | OpenRouter | $0.03 | $0.04 | Serverless |
| Llama 3 8B Instruct | Novita AI | $0.04 | $0.04 | Serverless |
| Llama 3 8B Instruct | DeepInfra | $0.05 | $0.15 | Serverless |
| DeepInfra Llama 3 8B Instruct | DeepInfra | $0.05 | $0.15 | Serverless |
| Llama 3 8B Instruct | Replicate API | $0.05 | $0.25 | Serverless |
| Llama 3 8B | Replicate API | $0.05 | $0.25 | Serverless |
| Llama 3 8B Instruct | Lepton AI API | $0.07 | $0.07 | Serverless |
| Together AI - Llama 3 8B Lite | Together AI | $0.1 | $0.1 | Serverless |
| Llama 3 8B Instruct | GCP Vertex AI | $0.12 | $0.36 | Serverless |
| Llama 3 8B Instruct | OctoAI API (Deprecated) | $0.15 | $0.15 | Serverless |
| Fireworks Llama-3-8B-Instruct | Fireworks AI | $0.15 | $0.15 | Serverless |
| Llama 3 8B Instruct | Together AI | $0.18 | $0.18 | Serverless |
| Llama 3 8B Instruct | Fireworks AI | $0.2 | $0.2 | Serverless |
| Together AI Llama-3-8B-Instruct | Together AI | $0.2 | $0.2 | Serverless |
| Llama 3 8B | Fireworks AI | $0.2 | $0.2 | Serverless |
| Llama 3 8B Instruct | AWS Bedrock | $0.3 | $0.6 | Serverless |
| Llama 3 8B Instruct | Microsoft Foundry | $0.37 | $1.1 | Serverless |
| Llama 3 70B Instruct | Hyperbolic AI Inference | $0.4 | $0.4 | Serverless |
| Llama 3 70B Instruct | DeepInfra | $0.45 | $0.65 | Serverless |
| DeepInfra Llama 3 70B Instruct | DeepInfra | $0.45 | $0.65 | Serverless |
| Llama 3 70B Instruct | OpenRouter | $0.51 | $0.74 | Serverless |
| Llama 3 70B Instruct | Novita AI | $0.51 | $0.74 | Serverless |
| Llama 3 8B Instruct | IBM watsonx | $0.6 | $0.6 | Serverless |
| Together AI Llama-3-70B-Instruct | Together AI | $0.6 | $0.75 | Serverless |
| Llama 3 70B Instruct | Replicate API | $0.65 | $2.75 | Serverless |
| Llama 3 70B | Replicate API | $0.65 | $2.75 | Serverless |
| Llama 3 70B Instruct | Lepton AI API | $0.8 | $0.8 | Serverless |
| Llama 3 70B Instruct | Together AI | $0.88 | $0.88 | Serverless |
| Llama 3 70B Instruct | OctoAI API (Deprecated) | $0.9 | $0.9 | Serverless |
| Llama 3 70B Instruct | Fireworks AI | $0.9 | $0.9 | Serverless |
| Llama 3 70B Instruct | AWS Bedrock | $0.99 | $0.99 | Serverless |
| Llama 3 70B Instruct | Databricks Foundation Model Serving | $1 | $3 | Serverless |
| Llama 3 70B Instruct | GCP Vertex AI | $1.2 | $3.6 | Serverless |
| Llama 3 70B Instruct | IBM watsonx | $1.8 | $1.8 | Serverless |
| Llama 3 70B Instruct | Microsoft Foundry | $3.78 | $11.34 | Serverless |
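
Because input and output tokens are priced separately, the provider with the lowest input price is not always the cheapest for a given workload. The sketch below works through the arithmetic for three Llama 3 8B Instruct rows from the pricing table. The function name `workload_cost` and the workload size (2M input tokens, 10M output tokens) are assumptions made for illustration, and the prices are a snapshot of the table that may since have changed.

```python
def workload_cost(input_tokens, output_tokens, input_per_1m, output_per_1m):
    """Total cost in dollars, given prices quoted per 1M tokens."""
    return (input_tokens * input_per_1m + output_tokens * output_per_1m) / 1_000_000


# (input $/1M, output $/1M) for Llama 3 8B Instruct, copied from the pricing table.
PRICES = {
    "OpenRouter": (0.03, 0.04),
    "Replicate API": (0.05, 0.25),
    "Lepton AI API": (0.07, 0.07),
}

# Hypothetical output-heavy workload: 2M input tokens, 10M output tokens.
INPUT_TOKENS = 2_000_000
OUTPUT_TOKENS = 10_000_000

for provider, (input_price, output_price) in PRICES.items():
    cost = workload_cost(INPUT_TOKENS, OUTPUT_TOKENS, input_price, output_price)
    print(f"{provider}: ${cost:.2f}")
```

For this workload the totals come to $0.46 (OpenRouter), $2.60 (Replicate API), and $0.84 (Lepton AI API). Lepton AI API ends up cheaper than Replicate API despite its higher input price, because output tokens dominate the bill.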

Frequently Asked Questions

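What is Llama 3 used for?
Llama 3 is used for structured outputs, coding, and, in one listed variant, agent workflows and tool use. Structured outputs are listed for 7 of the 11 models; function calling and tool use are listed only for Together AI - Llama 3 8B Lite. This guidance is drawn from the family description and the listed model capabilities.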
How does Llama 3 compare to MOSS-Audio?
Llama 3 by AI at Meta fits text workloads that need structured outputs or, with the one variant that supports them, agent workflows and tool use. MOSS-Audio by MOSI AI is the closest related family to check if you need multimodal capabilities. Llama 3 has 11 listed variants and reaches up to 8k context, so compare the specifications and pricing tables before choosing a production model.
Which Llama 3 model should I use?
For the lowest listed input price, start with Llama 3 8B Instruct through OpenRouter at $0.03/1M input tokens. For the most recent variant with the broadest listed capabilities, evaluate Together AI - Llama 3 8B Lite (released 2025-07), which offers 8k context with tool use, function calling, and structured outputs.
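
Every variant listed here tops out at 8k context, and the prompt and the generated output share that one window. The following is a rough sketch of a budget check, assuming "8k" means 8,192 tokens and that token counts come from whatever tokenizer the provider uses. The names `CONTEXT_WINDOW` and `max_output_tokens` are hypothetical.

```python
CONTEXT_WINDOW = 8192  # assumption: "8k" on this page means 8,192 tokens


def max_output_tokens(prompt_tokens, context_window=CONTEXT_WINDOW, reserve=0):
    """Tokens left for generation once the prompt (plus an optional reserve) is counted."""
    remaining = context_window - prompt_tokens - reserve
    if remaining <= 0:
        raise ValueError(
            f"prompt of {prompt_tokens} tokens leaves no room in a "
            f"{context_window}-token context window"
        )
    return remaining


# A 6,000-token prompt leaves 2,192 tokens for the response.
print(max_output_tokens(6000))
```

A prompt that already fills the window raises `ValueError`, which is a cue to shorten or chunk the input before sending it.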