Llama 3 8B Instruct
About
Llama 3 8B Instruct, released on April 18, 2024, is an 8-billion-parameter instruction-tuned language model from Meta. It uses an auto-regressive, decoder-only transformer architecture with Grouped-Query Attention (GQA) to improve inference scalability. It was pretrained on over 15 trillion tokens and fine-tuned with 10 million human-annotated examples, and it is optimized for dialogue and conversational tasks. It outperforms its predecessors on common industry benchmarks, scoring 68.4 on MMLU (5-shot). It is intended for commercial and research use, with an emphasis on safety and helpfulness, which makes it suited to chatbots, virtual assistants, and other interactive applications. For more details, see the Hugging Face page [1].
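To make the scalability point about Grouped-Query Attention concrete, here is a rough sketch of the key/value cache arithmetic. The layer count, head counts, and head dimension are assumptions taken from the publicly released Llama 3 8B configuration rather than from this page, and the 8,192-token context is an assumed reading of "8K". The function names are hypothetical.

```python
# Illustrative KV-cache arithmetic for Grouped-Query Attention (GQA).
# All configuration values are assumptions (public Llama 3 8B config),
# not figures stated on this page.
N_LAYERS = 32          # transformer blocks
N_QUERY_HEADS = 32     # attention (query) heads per layer
N_KV_HEADS = 8         # shared key/value heads per layer under GQA
HEAD_DIM = 128         # dimension of each head
BYTES_PER_VALUE = 2    # fp16 / bf16 storage


def kv_head_for_query_head(query_head: int) -> int:
    """Return the key/value head that a given query head shares under GQA."""
    group_size = N_QUERY_HEADS // N_KV_HEADS  # 4 query heads per KV head
    return query_head // group_size


def kv_cache_bytes(n_tokens: int, n_kv_heads: int) -> int:
    """Bytes needed to cache keys and values for n_tokens of one sequence."""
    # Factor 2 accounts for storing both the key and the value tensors.
    return 2 * N_LAYERS * n_kv_heads * HEAD_DIM * BYTES_PER_VALUE * n_tokens


if __name__ == "__main__":
    context = 8192  # assumed size of the "8K" context window
    gqa = kv_cache_bytes(context, N_KV_HEADS)
    mha = kv_cache_bytes(context, N_QUERY_HEADS)  # hypothetical no-GQA baseline
    print(f"GQA: {gqa / 2**30:.1f} GiB, full multi-head: {mha / 2**30:.1f} GiB")
```

Under these assumptions, sharing each key/value head across four query heads cuts the per-sequence cache to a quarter of what full multi-head attention would need, which is where the inference scalability benefit comes from.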
Capabilities
Multimodal, Function Calling, Tool Use, JSON Mode
Providers (18)
| Provider | Input (USD per 1M tokens) | Output (USD per 1M tokens) | Type |
|---|---|---|---|
| AWS Bedrock | $0.30 | $0.60 | Serverless |
| GroqCloud | $0.05 | $0.08 | Serverless |
| deepinfra API | — | — | Serverless |
| OctoAI API | $0.15 | $0.15 | Serverless |
| Fireworks AI Platform | $0.20 | $0.20 | Serverless |
| Alibaba Cloud PAI-EAS | — | — | Serverless |
| Baseten API | — | — | Serverless |
| Lepton AI API | — | — | Serverless |
| Replicate API | — | — | Serverless |
| GCP Vertex AI | — | — | Serverless |
| Snowflake Cortex | $0.38 | $0.38 | Serverless |
| Cloudflare Workers AI | — | — | Serverless |
| NVIDIA NIM | — | — | Provisioned |
| Together AI API | $0.18 | $0.18 | Serverless |
| Perplexity Labs | — | — | Serverless |
| Databricks Foundation Model Serving | — | — | Provisioned |
| IBM watsonx | $0.60 | $0.60 | Serverless |
| Azure OpenAI | $0.37 | $1.10 | Serverless, Provisioned |
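Because input and output tokens are priced separately, the cheapest provider can depend on the shape of the workload. The sketch below estimates cost from the listed prices, assuming they are in USD per 1 million tokens. Only providers with published prices in the table are included. `estimate_cost` and `cheapest_provider` are hypothetical helper names, and the figures are a snapshot from this page, not live pricing.

```python
# (input, output) prices in USD per 1M tokens, copied from the table above.
# Providers without a listed price are omitted.
PRICES_PER_MILLION = {
    "AWS Bedrock": (0.30, 0.60),
    "GroqCloud": (0.05, 0.08),
    "OctoAI API": (0.15, 0.15),
    "Fireworks AI Platform": (0.20, 0.20),
    "Snowflake Cortex": (0.38, 0.38),
    "Together AI API": (0.18, 0.18),
    "IBM watsonx": (0.60, 0.60),
    "Azure OpenAI": (0.37, 1.10),
}


def estimate_cost(provider: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload on one provider."""
    input_price, output_price = PRICES_PER_MILLION[provider]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def cheapest_provider(input_tokens: int, output_tokens: int) -> str:
    """Return the listed provider with the lowest estimated cost."""
    return min(
        PRICES_PER_MILLION,
        key=lambda name: estimate_cost(name, input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Example workload: 2M prompt tokens and 500K generated tokens.
    for name in PRICES_PER_MILLION:
        print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

For the example workload, the Azure OpenAI row works out to 2 × $0.37 + 0.5 × $1.10 = $1.29, while flat-rate providers such as Together AI API cost the same per token regardless of the input/output split.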
Benchmark Scores (4)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| Google-Proof Q&A (GPQA) | 44.8 | diamond | research |
| HellaSwag | 91.1 | 10-shot | research |
| HumanEval | 68.2 | pass@1 | research |
| Massive Multitask Language Understanding (MMLU) | 76.9 | 5-shot | research |
Specifications
| Specification | Value |
|---|---|
| Family | Llama 3 |
| Released | 2024-04-18 |
| Parameters | 8B |
| Context | 8K |
| Architecture | Decoder Only |
| Specialization | General |