Using Ling-2.6-Flash on Novita AI

Implementation guide · Ling 2.6 · InclusionAI

ServerlessOpen Source

Quick Start

1
Create an account at Novita AI and generate an API key.
2
Use the Novita AI SDK or REST API to call ling-2.6-flash.
3
You'll be billed $0.10/1M input, $0.30/1M output tokens. See full pricing.

API Portal Pricing Model Card

Code Examples

Code examples for this provider have not been sourced yet.

About Novita AI

Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.

View all models on Novita AI →

Pricing on Novita AI

Type	Price (per 1M)
Input tokens	$0.10
Output tokens	$0.30

Capabilities

Function CallingTool UseStructured Outputs

About Ling-2.6-Flash

InclusionAI's efficient 104B MoE instruct model with only 7.4B active parameters per token. Purpose-built for agentic workflows requiring fast responses and high token efficiency. Achieves 59.3% on GPQA Diamond. Nearly double the Artificial Analysis Intelligence Index score of comparable open-weight models. Available free on OpenRouter (inclusionai/ling-2.6-flash:free).

Full model details →

Model Specs

Released2026-04-21

Parameters104B (7.4B activated)

Context262k

ArchitectureMixture of Experts