Nemotron 3 Nano
nemotron-3-nano
About
NVIDIA's lightweight 3.97B-parameter model, optimized for edge deployment with FP8 quantization (W8A8: 8-bit weights and 8-bit activations, mixed precision). It is designed for agentic AI applications, including gaming NPCs, local voice assistants, and IoT automation, and supports instruction following, tool use, and hallucination avoidance. It shows strong performance on BFCL, IFBench, IFEval, HaluEval, RULER, Tau2, AIME25, MATH500, GPQA-D, and LiveCodeBench.
Nemotron 3 Nano has a 256K-token context window.
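As a rough illustration of why FP8 weights matter for edge deployment, the sketch below estimates weight storage from the parameter count alone. It assumes every weight is stored at the stated width; because the model is described as mixed precision, some layers may be kept at higher precision, and the KV cache and activations are not counted, so a real footprint would likely be larger. The function name `weight_memory_gb` is hypothetical and used only for this example.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes).

    Assumption: all weights use the same bit width. This ignores
    the KV cache, activations, and any higher-precision layers.
    """
    bytes_per_weight = bits_per_weight / 8
    return num_params * bytes_per_weight / 1e9


PARAMS = 3.97e9  # parameter count taken from the listing

fp8_gb = weight_memory_gb(PARAMS, 8)    # 8-bit weights (the W8 in W8A8)
bf16_gb = weight_memory_gb(PARAMS, 16)  # 16-bit baseline for comparison

print(f"FP8 weights:  {fp8_gb:.2f} GB")
print(f"BF16 weights: {bf16_gb:.2f} GB")
```

Under these assumptions the 8-bit weights take about half the space of a 16-bit copy, which is the main reason a model of this size can fit on memory-constrained devices.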
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution, Prompt Caching, Batch API, Audio, Fine-tuning
Providers (1)
| Provider | Input (per 1M tokens) | Output (per 1M tokens) | Type |
|---|---|---|---|
| NVIDIA NIM | — | — | Serverless |
Specifications
Family: Nemotron-3
Released: 2026-03-16
Parameters: 3.97B
Context: 256K
Architecture: Mixture of Experts
Specialization: general
License: Apache 2.0
Training: pretrained
Created by
Accelerated AI for enterprise solutions
Santa Clara, California, United States
Founded 2015