LLM Reference

Nemotron 3 Models by NVIDIA AI

9 models | 2024–2026 | Up to 1.05m context | From $0.05/1M input tokens

About

NVIDIA Nemotron 3 is the 2025-2026 open model family covering Nano 30B-A3B, Super 120B-A12B, Content Safety 4B, VoiceChat 12B, and Nano Omni variants for agentic reasoning, safety classification, and multimodal deployment.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Nemotron 3 Nano Omni
Use when the workload needs omni, 262k context, and 30B parameters.
2026-04 | omni | 262k context | 30B parameters

Nemotron 3 Content Safety
Use when the workload needs moderation, 131k context, and 4B parameters.
2026-03 | moderation | 131k context | 4B parameters

Nemotron 3 VoiceChat
Use when the workload needs 12B parameters and multimodal inputs.
2026-03 | 12B parameters | multimodal inputs

Nemotron 3 Super-120B-A12B
Use when the workload needs 1.05m context, 120B parameters, and structured outputs.
2026-03 | 1.05m context | 120B parameters | structured outputs

Nemotron 3 Nano
Use when the workload needs 256k context, 4.0B parameters, and tool use.
2025-12 | 256k context | 4.0B parameters | tool use

Nemotron 3 Nano 30B-A3B
Use when the workload needs structured outputs.
2025-12 | structured outputs

Llama 3.3 Nemotron Super 49B v1
Use when the workload needs 128k context and 49B parameters.
2025-06 | 128k context | 49B parameters

Llama 3.1 Nemotron 70B Reward
Use when the workload needs safety, 4k context, and 70B parameters.
2024-10 | safety | 4k context | 70B parameters

Nemotron 3 Ultra
Use when the workload needs 128k context.
2024-09 | 128k context
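
The use-when lines above follow a regular pattern: the tags attached to a variant are joined into a single sentence. The page does not show how it generates them, so the following is only a minimal sketch of one way such a sentence could be built. The function name `use_when` is hypothetical.

```python
def use_when(needs: list[str]) -> str:
    """Join a variant's tags into a 'Use when ...' sentence.

    Hypothetical helper: it only mirrors the sentence pattern visible
    in the listing and is not the site's actual generator.
    """
    if not needs:
        raise ValueError("at least one need is required")
    if len(needs) == 1:
        joined = needs[0]
    elif len(needs) == 2:
        joined = f"{needs[0]} and {needs[1]}"
    else:
        # Three or more items: comma-separated with a final ", and".
        joined = ", ".join(needs[:-1]) + f", and {needs[-1]}"
    return f"Use when the workload needs {joined}."


if __name__ == "__main__":
    print(use_when(["omni", "262k context", "30B parameters"]))
    print(use_when(["12B parameters", "multimodal inputs"]))
    print(use_when(["structured outputs"]))
```

The three calls reproduce the wording used for the omni, 12B multimodal, and structured-outputs variants in the listing.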

Release Timeline

6 release groups

2026-04 (1 current)
Nemotron 3 Nano Omni | omni | 262k context | 30B parameters | Current

2026-03 (3 current)
Nemotron 3 Content Safety | moderation | 131k context | 4B parameters | Current
Nemotron 3 Super-120B-A12B | 1.05m context | 120B parameters | structured outputs | Current
Nemotron 3 VoiceChat | 12B parameters | multimodal inputs | Current

2025-12 (2 current)
Nemotron 3 Nano | 256k context | 4.0B parameters | tool use | Current
Nemotron 3 Nano 30B-A3B | structured outputs | Current

2025-06 (1 current)
Llama 3.3 Nemotron Super 49B v1 | 128k context | 49B parameters | Current

2024-10 (1 current)
Llama 3.1 Nemotron 70B Reward | safety | 4k context | 70B parameters | Current

2024-09 (1 current)
Nemotron 3 Ultra | 128k context | Current

Specifications (9 models)

Nemotron 3 model specifications comparison
Model | Released | Context | Parameters | Vision | Multimodal | Function Calling | Tool Use | Structured Outputs
Nemotron 3 Nano Omni | 2026-04 | 262k | 30B | No | Yes | No | No | No
Nemotron 3 Content Safety | 2026-03 | 131k | 4B | Yes | Yes | No | No | No
Nemotron 3 VoiceChat | 2026-03 | - | 12B | Yes | Yes | No | No | No
Nemotron 3 Super-120B-A12B | 2026-03 | 1.05m | 120B | No | No | No | No | Yes
Nemotron 3 Nano | 2025-12 | 256k | 3.97B | No | No | Yes | Yes | No
Nemotron 3 Nano 30B-A3B | 2025-12 | - | 30B (3B active) | No | No | No | No | Yes
Llama 3.3 Nemotron Super 49B v1 | 2025-06 | 128k | 49B | No | No | No | No | No
Llama 3.1 Nemotron 70B Reward | 2024-10 | 4k | 70B | No | No | No | No | No
Nemotron 3 Ultra | 2024-09 | 128k | 550B (55B active) | No | No | No | No | No
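
To shortlist a variant from the specifications, it can help to filter the rows by the capabilities a workload requires. The sketch below copies a subset of the columns into a small in-memory list. The names `Spec`, `SPECS`, and `find` are hypothetical, and the context sizes assume that "k" means 1,000 tokens and "m" means 1,000,000 tokens, which the page does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Spec:
    name: str
    released: str
    context_tokens: Optional[int]  # None where the table lists no context
    tool_use: bool
    structured_outputs: bool


# Rows copied from the specifications table. Context values are an
# assumption: "262k" is read as 262,000 and "1.05m" as 1,050,000.
SPECS = [
    Spec("Nemotron 3 Nano Omni", "2026-04", 262_000, False, False),
    Spec("Nemotron 3 Content Safety", "2026-03", 131_000, False, False),
    Spec("Nemotron 3 VoiceChat", "2026-03", None, False, False),
    Spec("Nemotron 3 Super-120B-A12B", "2026-03", 1_050_000, False, True),
    Spec("Nemotron 3 Nano", "2025-12", 256_000, True, False),
    Spec("Nemotron 3 Nano 30B-A3B", "2025-12", None, False, True),
    Spec("Llama 3.3 Nemotron Super 49B v1", "2025-06", 128_000, False, False),
    Spec("Llama 3.1 Nemotron 70B Reward", "2024-10", 4_000, False, False),
    Spec("Nemotron 3 Ultra", "2024-09", 128_000, False, False),
]


def find(min_context: int = 0, tool_use: bool = False,
         structured_outputs: bool = False) -> list[str]:
    """Return model names that meet every stated requirement."""
    matches = []
    for spec in SPECS:
        # A model with no listed context cannot satisfy a context minimum.
        if min_context and (spec.context_tokens is None
                            or spec.context_tokens < min_context):
            continue
        if tool_use and not spec.tool_use:
            continue
        if structured_outputs and not spec.structured_outputs:
            continue
        matches.append(spec.name)
    return matches


if __name__ == "__main__":
    print(find(min_context=200_000))
    print(find(structured_outputs=True))
```

Models whose context is not listed are excluded whenever a minimum context is requested, which errs on the side of not recommending a model with unknown limits.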

Available From (7 providers)

Pricing

Nemotron 3 model pricing by provider
Model | Provider | Input / 1M tokens | Output / 1M tokens | Type
Nemotron 3 Nano 30B-A3B | Vercel AI Gateway | $0.05 | $0.24 | Serverless
Nemotron 3 Nano 30B-A3B | AWS Bedrock | $0.06 | $0.24 | Serverless
Nemotron 3 Super-120B-A12B | OpenRouter | $0.09 | $0.45 | Serverless
Nemotron 3 Super-120B-A12B | DeepInfra | $0.10 | $0.50 | Serverless
Nemotron 3 Super-120B-A12B | NVIDIA NIM | $0.10 | $0.50 | Serverless
Nemotron 3 Super-120B-A12B | Vercel AI Gateway | $0.15 | $0.65 | Serverless
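
Because prices are quoted per 1M tokens, the cost of a workload is input tokens times the input rate plus output tokens times the output rate, divided by one million. The following is a minimal sketch of that arithmetic using the listed prices. The names `PRICES`, `estimate_cost`, and `cheapest` are hypothetical, and the token counts in the example are made up for illustration.

```python
from decimal import Decimal

# (model, provider) -> (input USD per 1M tokens, output USD per 1M tokens),
# copied from the pricing table.
PRICES = {
    ("Nemotron 3 Nano 30B-A3B", "Vercel AI Gateway"): (Decimal("0.05"), Decimal("0.24")),
    ("Nemotron 3 Nano 30B-A3B", "AWS Bedrock"): (Decimal("0.06"), Decimal("0.24")),
    ("Nemotron 3 Super-120B-A12B", "OpenRouter"): (Decimal("0.09"), Decimal("0.45")),
    ("Nemotron 3 Super-120B-A12B", "DeepInfra"): (Decimal("0.10"), Decimal("0.50")),
    ("Nemotron 3 Super-120B-A12B", "NVIDIA NIM"): (Decimal("0.10"), Decimal("0.50")),
    ("Nemotron 3 Super-120B-A12B", "Vercel AI Gateway"): (Decimal("0.15"), Decimal("0.65")),
}

MILLION = Decimal(1_000_000)


def estimate_cost(model: str, provider: str,
                  input_tokens: int, output_tokens: int) -> Decimal:
    """Estimated USD cost for the given token counts at the listed rates."""
    in_rate, out_rate = PRICES[(model, provider)]
    return (in_rate * input_tokens + out_rate * output_tokens) / MILLION


def cheapest(model: str, input_tokens: int, output_tokens: int) -> str:
    """Provider with the lowest estimated cost for this model and workload."""
    providers = [p for (m, p) in PRICES if m == model]
    if not providers:
        raise KeyError(model)
    return min(providers,
               key=lambda p: estimate_cost(model, p, input_tokens, output_tokens))


if __name__ == "__main__":
    # Illustrative workload: 2M input tokens and 500k output tokens.
    model = "Nemotron 3 Super-120B-A12B"
    provider = cheapest(model, 2_000_000, 500_000)
    print(provider, estimate_cost(model, provider, 2_000_000, 500_000))
```

`Decimal` is used instead of floats so that small per-token prices do not accumulate rounding error. The ranking can change with the input-to-output ratio, so it is worth running the estimate with token counts that reflect the actual workload.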

Frequently Asked Questions

What is Nemotron 3 used for?
Nemotron 3 is used for omni-modal (multimodal) workloads, content moderation, and safety classification. The family description and the listed model capabilities point to those workloads as the best fit.

How does Nemotron 3 compare to NVIDIA Nemotron Nano 12B v2 VL?
Nemotron 3 by NVIDIA AI is the stronger fit where you need omni-modal capability, while NVIDIA Nemotron Nano 12B v2 VL, also by NVIDIA AI, is the closest related family to check for structured outputs. Nemotron 3 lists 9 variants and reaches up to 1.05m context, so compare the specifications and pricing tables before choosing a production model.

Which Nemotron 3 model should I use?
For the lowest listed input price, start with Nemotron 3 Nano 30B-A3B through Vercel AI Gateway at $0.05 per 1M input tokens. For a capable, recent model to run locally, evaluate Nemotron 3 Nano, which lists 256k context, tool use, and function calling.

Models (9)