LLM Reference

Using Ling-2.6-Flash on Novita AI

Implementation guide · Ling 2.6 · InclusionAI

Serverless

Quick Start

  1. 1
    Create an account at Novita AI and generate an API key.
  2. 2
    Use the Novita AI SDK or REST API to call ling-2.6-flash.
  3. 3
    You'll be billed $0.10/1M input, $0.30/1M output tokens. See full pricing.

Code Examples

Code examples for this provider have not been sourced yet.

About Novita AI

Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.

Pricing on Novita AI

TypePrice (per 1M)
Input tokens$0.10
Output tokens$0.30

Capabilities

Function CallingTool UseStructured Outputs

About Ling-2.6-Flash

InclusionAI's efficient 104B MoE instruct model with only 7.4B active parameters per token. Purpose-built for agentic workflows requiring fast responses and high token efficiency. Achieves 59.3% on GPQA Diamond. Nearly double the Artificial Analysis Intelligence Index score of comparable open-weight models. Available free on OpenRouter (inclusionai/ling-2.6-flash:free).

Model Specs

Released2026-04-21
Parameters104B (7.4B activated)
Context262K
Architecturemoe

Provider

Novita AI