Using Llama 4 Maverick 17B Instruct FP8 on Novita AI

Implementation guide · Llama 4 · AI at Meta

ServerlessOpen Weights

Quick Start

1
Create an account at Novita AI and generate an API key.
2
Use the Novita AI SDK or REST API to call llama-4-maverick-17b-128e-instruct-fp8.
3
You'll be billed $0.27/1M input, $0.85/1M output tokens. See full pricing.

API Portal Pricing Model Card

Code Examples

Code examples for this provider have not been sourced yet.

About Novita AI

Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.

View all models on Novita AI →

Pricing on Novita AI

Type	Price (per 1M)
Input tokens	$0.27
Output tokens	$0.85

Capabilities

VisionMultimodalStructured Outputs

About Llama 4 Maverick 17B Instruct FP8

Meta's Llama 4 Maverick 17B with 128 experts, FP8-optimized for cost-efficient inference. Supports native Model Router integration on Microsoft Foundry.

Full model details →

Model Specs

Released2025-04-05

Parameters400B (17B active)

Context1m

ArchitectureMixture of Experts

Knowledge cutoff2024-08