LLM Reference

Using Llama 4 Maverick 17B Instruct FP8 on Inceptron

Implementation guide · Llama 4 · AI at Meta

ServerlessOpen Weights

Quick Start

  1. 1
    Create an account at Inceptron and generate an API key.
  2. 2
    Use the Inceptron SDK or REST API to call llama-4-maverick-17b-128e-instruct-fp8.

Code Examples

Code examples for this provider have not been sourced yet.

About Inceptron

Inceptron provides AI inference acceleration hardware and software solutions for efficient model deployment.

Pricing on Inceptron

Capabilities

VisionMultimodalStructured Outputs

About Llama 4 Maverick 17B Instruct FP8

Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.

Model Specs

Released2025-04-05
Parameters400B (17B active)
Context1m
ArchitectureMixture of Experts
Knowledge cutoff2024-08

Provider

Inceptron