Llama Guard Models by AI at Meta
About
The Llama Guard family of LLMs, developed by Meta AI, provides content safety classification for human-AI interactions. The models examine both inputs (prompts) and outputs (responses) and flag potentially unsafe content against a safety risk taxonomy 14. The family initially handled text only; the Llama Guard 3 Vision model extended it to multimodal inputs, including image analysis 2. The models are reported to match or exceed existing content moderation solutions on widely used benchmarks 1. They are also instruction-tuned, so they can be adapted to different use cases and safety frameworks 14. Llama Guard models, including version 3-8B and its variants, are available on Hugging Face 4.
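The input-and-output screening described above can be sketched as a small wrapper. This is a minimal illustration, not Meta's implementation: it assumes the classifier replies with `safe`, or with `unsafe` followed by a line of comma-separated category codes, which the text above does not specify. The `classify` callable, the `Verdict` type, and codes such as `S1` are hypothetical placeholders for a real model call and its taxonomy.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    safe: bool
    categories: list = field(default_factory=list)


def parse_verdict(raw: str) -> Verdict:
    """Parse a reply of the assumed form 'safe' or 'unsafe' plus a category line."""
    lines = [line.strip() for line in raw.strip().splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty classifier output")
    label = lines[0].lower()
    if label == "safe":
        return Verdict(safe=True)
    if label == "unsafe":
        cats = []
        if len(lines) > 1:
            # Assumed format: category codes separated by commas on line two.
            cats = [c.strip() for c in lines[1].split(",") if c.strip()]
        return Verdict(safe=False, categories=cats)
    raise ValueError(f"unrecognized label: {lines[0]!r}")


def moderate(prompt: str, response: str, classify) -> dict:
    """Screen both the prompt and the response with a caller-supplied classifier."""
    return {
        "prompt": parse_verdict(classify(prompt)),
        "response": parse_verdict(classify(response)),
    }
```

In practice `classify` would send the text to whichever hosted or local Llama Guard variant is in use; keeping it as a parameter lets the parsing logic be tested without a model.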
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Llama Guard 4 12B | Use when the workload needs safety, 164k context, and 12B parameters. | 2025-04 | safety, 164k context, 12B parameters | Current |
| Llama Guard 3 1B | Use when the workload needs safety, 128k context, and 1B parameters. | 2024-09 | safety, 128k context, 1B parameters | Current |
| Llama Guard 3 11B Vision | Use when the workload needs safety, 128k context, and 11B parameters. | 2024-09 | safety, 128k context, 11B parameters | Current |
| Llama Guard 3 8B | Use when the workload needs safety, 8k context, and 8B parameters. | 2024-07 | safety, 8k context, 8B parameters | Current |
| Llama Guard 2 8B | Use when the workload needs safety, 8k context, and 8B parameters. | 2024-04 | safety, 8k context, 8B parameters | Current |
| Llama Guard 7B | Use when the workload needs safety, 2k context, and 7B parameters. | 2023-12 | safety, 2k context, 7B parameters | Current |
Release Timeline
5 release groups
Specifications (6 models)
| Model | Released | Context | Parameters | Vision | Structured Outputs |
|---|---|---|---|---|---|
| Llama Guard 4 12B | 2025-04 | 164k | 12B | No | Yes |
| Llama Guard 3 1B | 2024-09 | 128k | 1B | No | No |
| Llama Guard 3 11B Vision | 2024-09 | 128k | 11B | Yes | No |
| Llama Guard 3 8B | 2024-07 | 8k | 8B | No | Yes |
| Llama Guard 2 8B | 2024-04 | 8k | 8B | No | No |
| Llama Guard 7B | 2023-12 | 2k | 7B | No | Yes |
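One way to use the specifications table is to filter it by workload requirements. The sketch below copies the release, context, vision, and structured-output columns from the table; the `Variant` type and the `pick_variants` helper are illustrative names, not part of any Llama Guard tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    released: str            # YYYY-MM, as listed in the table
    context_k: int           # context window in thousands of tokens
    vision: bool
    structured_outputs: bool


# Values copied from the specifications table.
VARIANTS = [
    Variant("Llama Guard 4 12B", "2025-04", 164, False, True),
    Variant("Llama Guard 3 1B", "2024-09", 128, False, False),
    Variant("Llama Guard 3 11B Vision", "2024-09", 128, True, False),
    Variant("Llama Guard 3 8B", "2024-07", 8, False, True),
    Variant("Llama Guard 2 8B", "2024-04", 8, False, False),
    Variant("Llama Guard 7B", "2023-12", 2, False, True),
]


def pick_variants(min_context_k=0, need_vision=False, need_structured=False):
    """Return variants meeting the requirements, newest release first."""
    matches = [
        v for v in VARIANTS
        if v.context_k >= min_context_k
        and (v.vision or not need_vision)
        and (v.structured_outputs or not need_structured)
    ]
    # YYYY-MM strings sort chronologically, so a plain string sort is enough.
    return sorted(matches, key=lambda v: v.released, reverse=True)


if __name__ == "__main__":
    for v in pick_variants(min_context_k=100):
        print(v.name, v.released)
```

Filtering on hard requirements first and ordering by release date second mirrors how the "Use when" guidance is phrased; pricing can then be compared across the shortlisted variants.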
Available From (8 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Llama Guard 2 8B | Replicate API | $0.05 | $0.25 | Serverless |
| Llama Guard 3 1B | Fireworks AI | $0.1 | $0.1 | Serverless |
| Llama Guard 2 8B | OctoAI API (Deprecated) | $0.15 | $0.15 | Serverless |
| Llama Guard 4 12B | OpenRouter | $0.18 | $0.18 | Serverless |
| Llama Guard 2 8B | Fireworks AI | $0.2 | $0.2 | Provisioned |
| Llama Guard 7B | Together AI | $0.2 | $0.2 | Serverless |
| Llama Guard 7B | Fireworks AI | $0.2 | $0.2 | Provisioned |
| Llama Guard 3 8B | Fireworks AI | $0.2 | $0.2 | Serverless |
| Llama Guard 4 12B | Replicate API | $0.2 | $0.2 | Serverless |
| Llama Guard 3 8B | Replicate API | $0.3 | $0.3 | Serverless |
| Llama Guard 3 8B | Microsoft Foundry | $0.37 | $1.1 | Provisioned |
| Llama Guard 3 8B | OpenRouter | $0.48 | $0.03 | Serverless |
| Llama Guard 3 8B | Cloudflare Workers AI | $0.484 | $0.03 | Serverless |
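Because input and output tokens are priced separately, the cheapest row depends on the traffic mix. The following sketch estimates cost for a few rows of the pricing table; the `PRICES` subset, the function names, and the example token volumes are illustrative assumptions, and real bills may include charges the table does not show.

```python
# (model, provider) -> (input USD per 1M tokens, output USD per 1M tokens).
# A subset of rows copied from the pricing table.
PRICES = {
    ("Llama Guard 2 8B", "Replicate API"): (0.05, 0.25),
    ("Llama Guard 4 12B", "OpenRouter"): (0.18, 0.18),
    ("Llama Guard 3 8B", "Fireworks AI"): (0.20, 0.20),
    ("Llama Guard 3 8B", "OpenRouter"): (0.48, 0.03),
}


def estimate_cost(model, provider, input_tokens, output_tokens):
    """Estimated USD cost for the given token volumes on one table row."""
    in_price, out_price = PRICES[(model, provider)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest(input_tokens, output_tokens):
    """Return the (model, provider) key with the lowest estimated cost."""
    return min(
        PRICES,
        key=lambda k: estimate_cost(k[0], k[1], input_tokens, output_tokens),
    )


if __name__ == "__main__":
    # Hypothetical monthly volume: 10M input tokens, 1M output tokens.
    cost = estimate_cost("Llama Guard 2 8B", "Replicate API", 10_000_000, 1_000_000)
    print(f"${cost:.2f}")
    print(cheapest(10_000_000, 1_000_000))
```

For that hypothetical volume the Replicate row works out to $0.50 for input plus $0.25 for output, about $0.75 in total. Safety classifiers typically read far more tokens than they emit, so the input price usually dominates the estimate.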
Frequently Asked Questions
- What is Llama Guard used for?
  - Llama Guard is used for content safety classification: screening prompts and responses for potentially unsafe content. The listed variants also cover vision and multimodal inputs and structured outputs.
- How does Llama Guard compare to Chameleon?
  - Llama Guard by AI at Meta is strongest where you need safety classification, while Chameleon by AI at Meta is the closest related family to consider for coding. Llama Guard has 6 listed variants and reaches up to 164k context; Chameleon reaches up to 4k context. Compare the specs and pricing tables before choosing a production model.
- Which Llama Guard model should I use?
  - For the lowest listed input price, start with Llama Guard 2 8B through Replicate API at $0.05 per 1M input tokens. For the most recent and largest listed variant, evaluate Llama Guard 4 12B, which offers 164k context and structured outputs.






