Using Xiaomi MiMo-V2-Flash on Novita AI

Implementation guide · MiMo V2 · Xiaomi

Serverless

Quick Start

1
Create an account at Novita AI and generate an API key.
2
Use the Novita AI SDK or REST API to call MiMo-V2-Flash.
3
You'll be billed $0.10/1M input, $0.30/1M output tokens. See full pricing.

API Portal Pricing Model Card

Code Examples

Code examples for this provider have not been sourced yet.

About Novita AI

Novita AI offers a GPU-based inference API for image, video, and language model generation with a broad catalog of open-source models.

View all models on Novita AI →

Pricing on Novita AI

Type	Price (per 1M)
Input tokens	$0.10
Output tokens	$0.30

Capabilities

ReasoningFunction Calling

About Xiaomi MiMo-V2-Flash

MiMo-V2-Flash is Xiaomi's efficient open-source Mixture-of-Experts model, announced December 17, 2025 at Xiaomi's Human-Car-Home Ecosystem Partner Conference. It has 309B total parameters with 15B active, uses hybrid attention that interleaves Sliding Window Attention and Global Attention, and extends native 32K context to 256K. Multi-Token Prediction enables about 2.6x speculative decoding speedup. The model was distributed with weights on Hugging Face and ranked highly on SWE-Bench Verified and multilingual benchmarks at research time.

Full model details →

Model Specs

Released2025-12-17

Parameters309B

Context262k

ArchitectureMixture of Experts

Knowledge cutoff2024-12

Also available on(1)

Vercel AI Gateway$0.10/1M

Compare all providers →

Provider

Novita AI