Llama 2 Models by AI at Meta
About
Llama 2, developed by Meta AI and released in July 2023, is a family of large language models positioned as an openly available alternative to proprietary chatbots. Its models range from 7 billion to 70 billion parameters, trading accuracy against computational cost. Llama 2 supports both research and commercial use, which broadens access for the AI community. The family emphasizes safety and helpfulness: the specialized Llama 2-Chat models are optimized for conversational applications using techniques such as reinforcement learning from human feedback (RLHF) [1] [2] [3].
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Llama 2 13B Chat | Use when the workload needs 4k context, 13B parameters, and structured outputs. | 2023-07 | 4k context, 13B parameters, structured outputs | Current |
| Llama 2 7B Chat | Use when the workload needs 4k context, 7B parameters, and structured outputs. | 2023-07 | 4k context, 7B parameters, structured outputs | Current |
| Llama 2 70B | Use when the workload needs 4k context and 70B parameters. | 2023-07 | 4k context, 70B parameters | Current |
| Llama 2 13B | Use when the workload needs 4k context and 13B parameters. | 2023-07 | 4k context, 13B parameters | Current |
| Llama 2 7B | Use when the workload needs 4k context and 7B parameters. | 2023-07 | 4k context, 7B parameters | Current |
| Llama 2 34B (Unreleased) | Use when the workload needs 4k context and 34B parameters. | 2023-07 | 4k context, 34B parameters | Current |
| Together AI Llama-2-7B-chat | Use when the workload needs 4k context, 7B parameters, and structured outputs. | 2023-07 | 4k context, 7B parameters, structured outputs | Current |
| Together AI Llama-2-13B-chat | Use when the workload needs 4k context, 13B parameters, and structured outputs. | 2023-07 | 4k context, 13B parameters, structured outputs | Current |
| Together AI Llama-2-70B-chat | Use when the workload needs 4k context, 70B parameters, and structured outputs. | 2023-07 | 4k context, 70B parameters, structured outputs | Current |
| OctoML Llama-2-70b-chat | Use when the workload needs 4k context and 70B parameters. | 2023-07 | 4k context, 70B parameters | Current |
| Meta Llama 2 Chat 70B | Use when the workload needs 4k context, 70B parameters, and structured outputs. | 2023-07 | 4k context, 70B parameters, structured outputs | Current |
| Llama 2 70B Chat on IBM Watsonx | Use when the workload needs 4k context and 70B parameters. | 2023-07 | 4k context, 70B parameters | Current |
| Vultr Llama 2 70B | Use when the workload needs 4k context and 70B parameters. | 2023-07 | 4k context, 70B parameters | Current |
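The "Use when" sentences in this table follow a regular pattern, so the derivation can be sketched in a few lines. This is only an illustration under assumptions: the function `use_when` and its parameters are hypothetical names, and the sketch covers just the context, parameter-count, and structured-output signals; the release and replacement fields mentioned above are not modeled.

```python
def use_when(context: str, parameters: str, structured_outputs: bool) -> str:
    """Build a 'Use when' sentence from spec fields (hypothetical helper)."""
    needs = [f"{context} context", f"{parameters} parameters"]
    if structured_outputs:
        needs.append("structured outputs")

    # Two items are joined with "and"; three or more use a serial comma.
    if len(needs) == 2:
        joined = " and ".join(needs)
    else:
        joined = ", ".join(needs[:-1]) + f", and {needs[-1]}"
    return f"Use when the workload needs {joined}."


if __name__ == "__main__":
    print(use_when("4k", "13B", True))
    print(use_when("4k", "70B", False))
```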
Release Timeline
1 release group
Specifications (14 models)
| Model | Released | Context | Parameters | Structured Outputs |
|---|---|---|---|---|
| Llama 2 13B Chat | 2023-07 | 4k | 13B | Yes |
| Llama 2 7B Chat | 2023-07 | 4k | 7B | Yes |
| Llama 2 70B | 2023-07 | 4k | 70B | No |
| Llama 2 13B | 2023-07 | 4k | 13B | No |
| Llama 2 7B | 2023-07 | 4k | 7B | No |
| Llama 2 34B (Unreleased) | 2023-07 | 4k | 34B | No |
| Together AI Llama-2-7B-chat | 2023-07 | 4k | 7B | Yes |
| Together AI Llama-2-13B-chat | 2023-07 | 4k | 13B | Yes |
| Together AI Llama-2-70B-chat | 2023-07 | 4k | 70B | Yes |
| OctoML Llama-2-70b-chat | 2023-07 | 4k | 70B | No |
| Meta Llama 2 Chat 70B | 2023-07 | 4k | 70B | Yes |
| Llama 2 70B Chat on IBM Watsonx | 2023-07 | 4k | 70B | No |
| Vultr Llama 2 70B | 2023-07 | 4k | 70B | No |
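One way to use the specifications table is to filter it by workload requirements. The sketch below copies the rows above into a small data structure and selects by minimum parameter count and structured-output support. The names `Spec`, `SPECS`, and `candidates` are hypothetical, chosen for illustration, and the data is only as accurate as the table it mirrors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Spec:
    name: str
    parameters_b: int          # parameter count in billions
    structured_outputs: bool   # "Structured Outputs" column


# Rows transcribed from the specifications table (all list 4k context).
SPECS = [
    Spec("Llama 2 13B Chat", 13, True),
    Spec("Llama 2 7B Chat", 7, True),
    Spec("Llama 2 70B", 70, False),
    Spec("Llama 2 13B", 13, False),
    Spec("Llama 2 7B", 7, False),
    Spec("Llama 2 34B (Unreleased)", 34, False),
    Spec("Together AI Llama-2-7B-chat", 7, True),
    Spec("Together AI Llama-2-13B-chat", 13, True),
    Spec("Together AI Llama-2-70B-chat", 70, True),
    Spec("OctoML Llama-2-70b-chat", 70, False),
    Spec("Meta Llama 2 Chat 70B", 70, True),
    Spec("Llama 2 70B Chat on IBM Watsonx", 70, False),
    Spec("Vultr Llama 2 70B", 70, False),
]


def candidates(min_params_b: int = 0, need_structured: bool = False) -> list[str]:
    """Return model names meeting the size and structured-output requirements."""
    return [
        s.name
        for s in SPECS
        if s.parameters_b >= min_params_b
        and (s.structured_outputs or not need_structured)
    ]


if __name__ == "__main__":
    print(candidates(min_params_b=70, need_structured=True))
```

For example, asking for at least 70B parameters with structured outputs narrows the list to the Together AI and Meta 70B chat entries.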
Available From (17 providers)
Pricing
Frequently Asked Questions
- What is Llama 2 used for?
- Llama 2 is used for chatbot and role-playing applications and for workloads that need structured outputs. The family description and the listed model capabilities point to these workloads as the best fit.
- How does Llama 2 compare to MOSS-Audio?
- Llama 2 by AI at Meta is strongest where you need structured outputs, while MOSS-Audio by MOSI Intelligence is the closest related family to check for multimodal work. Llama 2 has 14 listed variants with a context window of up to 4k tokens, so compare the specifications and pricing tables before choosing a production model.
- Which Llama 2 model should I use?
- For the lowest listed input price, start with Llama 2 7B Chat through the Replicate API at $0.05 per 1M input tokens. The listing also suggests evaluating Together AI Llama-2-7B-chat, which offers 4k context and structured outputs. If output quality matters more than cost, the 70B variants are the largest models listed.
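As a rough illustration of what the quoted input price means in practice, the sketch below converts a token count into an estimated input cost. The function name `input_cost_usd` is hypothetical, the default rate is the $0.05 per 1M input tokens figure quoted above, and output-token pricing is not covered because this page does not state it.

```python
def input_cost_usd(input_tokens: int, price_per_million: float = 0.05) -> float:
    """Estimate input cost in USD for a given number of input tokens.

    The default rate is the listed $0.05 per 1M input tokens; treat it as
    an assumption and check the provider's current pricing before relying on it.
    """
    if input_tokens < 0:
        raise ValueError("input_tokens must be non-negative")
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    # 2 million input tokens at the listed rate.
    print(f"${input_cost_usd(2_000_000):.2f}")
```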
Models (14)
Llama 2 13B Chat
Llama 2 7B Chat
Llama 2 70B
Llama 2 13B
Llama 2 7B
Llama 2 34B (Unreleased)
Together AI Llama-2-7B-chat
Together AI Llama-2-13B-chat
Together AI Llama-2-70B-chat
OctoML Llama-2-70b-chat
Meta Llama 2 Chat 70B
Llama 2 70B Chat on IBM Watsonx
Vultr Llama 2 70B



