Llama 4 Maverick 17B Instruct FP8
llama-4-maverick-17b-128e-instruct-fp8
Open Source
About
Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.
Llama 4 Maverick 17B Instruct FP8 has a 1M-token context window.
Llama 4 Maverick 17B Instruct FP8 is priced at $0.15 per 1M input tokens and $0.60 per 1M output tokens.
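As a rough sketch of how these per-1M-token rates translate into request cost (the rates are from this page; actual billing varies by provider, and the function name here is illustrative):

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = 0.15, output_rate: float = 0.60) -> float:
    """Dollar cost of one request, given per-1M-token rates.

    Defaults use the headline rates listed above:
    $0.15 / 1M input tokens, $0.60 / 1M output tokens.
    """
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# Example: a 10,000-token prompt with a 2,000-token completion:
# 10_000 * 0.15/1M + 2_000 * 0.60/1M = $0.0015 + $0.0012 = $0.0027
```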
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning
Providers (7)
| Provider | Input (per 1M) | Output (per 1M) | Type |
|---|---|---|---|
| Microsoft Foundry | — | — | Serverless, Provisioned |
| Together AI | $0.27 | $0.85 | Serverless |
| OpenRouter | $0.15 | $0.60 | Serverless |
| Fireworks AI | — | — | Serverless |
| DeepInfra | $0.15 | $0.60 | Serverless |
| GCP Vertex AI | $0.35 | $1.15 | Serverless |
| NVIDIA NIM | — | — | Serverless |
Benchmark Scores (1)
| Benchmark | Score | Version | Source |
|---|---|---|---|
| τ-bench | 68.5 | τ-bench | https://taubench.com/ |
Rankings
Compare
Llama 4 Maverick 17B Instruct FP8 vs GPT-4o (08-06)
Llama 4 Maverick 17B Instruct FP8 vs Claude Sonnet 4.6
Llama 4 Maverick 17B Instruct FP8 vs Gemini 3.1 Pro
Llama 4 Maverick 17B Instruct FP8 vs DeepSeek V4 Pro
Llama 4 Maverick 17B Instruct FP8 vs DeepSeek V3
Llama 4 Maverick 17B Instruct FP8 vs Grok 4
Llama 4 Maverick 17B Instruct FP8 vs Qwen3.6-Max
Llama 4 Maverick 17B Instruct FP8 vs GPT-5
Llama 4 Maverick 17B Instruct FP8 vs Llama 4 Scout 17B-16E Instruct
Llama 4 Maverick 17B Instruct FP8 vs Llama 3.3 70B
Llama 4 Maverick 17B Instruct FP8 vs DeepSeek V4