Using Nemotron 3 Nano Omni on OpenRouter

Implementation guide · Nemotron 3 · NVIDIA AI

ServerlessOpen Weights

Quick Start

1
Create an account at OpenRouter and generate an API key.
2
Use the OpenRouter SDK or REST API to call nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free — see the documentation for request format.
3
You'll be billed Free/1M input, Free/1M output tokens. See full pricing.

API Portal Documentation Pricing Model Card

Code Examples

See OpenRouter documentation for integration details.

About OpenRouter

OpenRouter provides a unified interface for Large Language Models with better pricing, improved uptime, and no subscription requirements. Route across providers for cost optimization and reliability.

OpenRouter is a multi-provider LLM aggregator offering unified API access to 300+ models from all major labs and emerging providers, with automatic failover for reliability.

View all models on OpenRouter →

Pricing on OpenRouter

Type	Price (per 1M)
Input tokens	Free
Output tokens	Free

Capabilities

MultimodalAudio

About Nemotron 3 Nano Omni

NVIDIA Nemotron 3 Nano Omni is an open-weight 30B hybrid MoE multimodal model (3B active parameters) that natively accepts text, image, video, and audio inputs in a single inference loop. Built on a hybrid Mamba-Transformer architecture with 23 Mamba-2 layers, 23 MoE layers (128 experts, 6+1 active), and 6 GQA layers, plus Conv3D video layers and Efficient Video Sampling (EVS). Delivers up to 9x higher throughput than comparable omni models. Supports a 256K context window and a 16,384 reasoning budget. Open weights, datasets, and training recipes released under a permissive license.

Full model details →

Model Specs

Released2026-04-28

Parameters30B

Context262k

ArchitectureMoE + SSM Hybrid

More Models on OpenRouter

Nemotron 3 Super-120B-A12B Nemotron 3 Ultra

All models on OpenRouter →

Provider

OpenRouter

OpenRouter, Inc.

New York, NY, USA