LLM ReferenceLLM Reference
OpenRouter

Using Nemotron 3 Nano Omni on OpenRouter

Implementation guide · Nemotron-3 · NVIDIA AI

ServerlessOpen Source

Quick Start

  1. 1
    Create an account at OpenRouter and generate an API key.
  2. 2
    Use the OpenRouter SDK or REST API to call nvidia/nemotron-3-nano-omni-30b-a3b-reasoning:free — see the documentation for request format.
  3. 3
    You'll be billed Free/1M input, Free/1M output tokens. See full pricing.

Code Examples

See OpenRouter documentation for integration details.

About OpenRouter

OpenRouter provides a unified interface for Large Language Models with better pricing, improved uptime, and no subscription requirements. Route across providers for cost optimization and reliability.

Multi-provider LLM aggregator offering unified API access to 300+ models from all major labs and emerging providers, with automatic failover for reliability.

Pricing on OpenRouter

TypePrice (per 1M)
Input tokensFree
Output tokensFree

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsCode Execution

About Nemotron 3 Nano Omni

NVIDIA Nemotron 3 Nano Omni is an open-weight 30B hybrid MoE multimodal model (3B active parameters) that natively accepts text, image, video, and audio inputs in a single inference loop. Built on a hybrid Mamba-Transformer architecture with 23 Mamba-2 layers, 23 MoE layers (128 experts, 6+1 active), and 6 GQA layers, plus Conv3D video layers and Efficient Video Sampling (EVS). Delivers up to 9x higher throughput than comparable omni models. Supports a 256K context window and a 16,384 reasoning budget. Open weights, datasets, and training recipes released under a permissive license.

Model Specs

Released2026-04-28
Parameters30B
Context256k
ArchitectureHybrid Mamba-Transformer MoE

Provider

OpenRouter
OpenRouter

OpenRouter, Inc.

New York, NY, USA