Using Llama 4 Maverick 17B Instruct FP8 on Inceptron
Implementation guide · Llama 4 · AI at Meta
ServerlessOpen Weights
Quick Start
- 1
- 2Use the Inceptron SDK or REST API to call
llama-4-maverick-17b-128e-instruct-fp8.
Code Examples
Code examples for this provider have not been sourced yet.
About Inceptron
Inceptron provides AI inference acceleration hardware and software solutions for efficient model deployment.
Pricing on Inceptron
Capabilities
VisionMultimodalStructured Outputs
About Llama 4 Maverick 17B Instruct FP8
Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.
Model Specs
Released2025-04-05
Parameters400B (17B active)
Context1m
ArchitectureMixture of Experts
Knowledge cutoff2024-08