Using Llama 4 Maverick 17B Instruct FP8 on Inceptron

Implementation guide · Llama 4 · AI at Meta

ServerlessOpen Weights

Quick Start

1
Create an account at Inceptron and generate an API key.
2
Use the Inceptron SDK or REST API to call llama-4-maverick-17b-128e-instruct-fp8.

API Portal Model Card

Code Examples

Code examples for this provider have not been sourced yet.

About Inceptron

Inceptron provides AI inference acceleration hardware and software solutions for efficient model deployment.

View all models on Inceptron →

Pricing on Inceptron

Capabilities

VisionMultimodalStructured Outputs

About Llama 4 Maverick 17B Instruct FP8

Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.

Full model details →

Model Specs

Released2025-04-05

Parameters400B (17B active)

Context1m

ArchitectureMixture of Experts

Knowledge cutoff2024-08

Also available on(10)

OpenRouter$0.15/1M DeepInfra$0.15/1M AWS Bedrock$0.24/1M

Compare all providers →

Provider

Inceptron