LLM Reference

Nemotron-Cascade Models by NVIDIA AI

1 model, 2026, up to 1.0M context

About

Cascade MoE reasoning models with superior performance on math and code tasks

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

Use when the workload needs 1.0M context and 30B parameters.

2026-03, 1.0M context, 30B parameters
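
The use-when line above is described as derived from the model's listed fields. A minimal sketch of how such a line could be generated from the context and parameter fields follows. The `ModelSpec` type, the helper names, and the reading of "1.0M" as 1,000,000 tokens are all assumptions made for illustration, not the site's actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """Hypothetical record holding the fields shown in the listing."""
    name: str
    released: str        # release month as "YYYY-MM"
    context_tokens: int  # assumption: "1.0M" is read as 1_000_000 tokens
    parameters: int      # total parameter count


def format_context(tokens: int) -> str:
    # Render a token count in millions with one decimal, e.g. 1_000_000 as "1.0M".
    return f"{tokens / 1_000_000:.1f}M"


def format_parameters(count: int) -> str:
    # Render a parameter count in whole billions, e.g. 30_000_000_000 as "30B".
    return f"{count // 1_000_000_000}B"


def use_when(spec: ModelSpec) -> str:
    # Build the guidance sentence from the context and parameter fields only.
    return (
        f"Use when the workload needs {format_context(spec.context_tokens)} "
        f"context and {format_parameters(spec.parameters)} parameters."
    )


spec = ModelSpec(
    name="Nemotron-Cascade-2-30B-A3B",
    released="2026-03",
    context_tokens=1_000_000,
    parameters=30_000_000_000,
)
guidance = use_when(spec)
print(guidance)
```

This sketch covers only the context and parameter fields. The capability, release, and replacement fields mentioned in the listing would need their own rules, which the page does not describe.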

Release Timeline

1 release group
2026-03 (1 current)
Nemotron-Cascade-2-30B-A3B (Current)
1.0M context, 30B parameters

Specifications (1 model)

Nemotron-Cascade model specifications comparison
Model                         Released   Context   Parameters
Nemotron-Cascade-2-30B-A3B    2026-03    1M        30B

Frequently Asked Questions

What is Nemotron-Cascade used for?
Nemotron-Cascade is used for coding and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.
How does Nemotron-Cascade compare to NVIDIA Nemotron Nano 12B v2 VL?
Nemotron-Cascade by NVIDIA AI is strongest on coding workloads, while NVIDIA Nemotron Nano 12B v2 VL, also by NVIDIA AI, is the closest related family to consider for structured outputs. Nemotron-Cascade has 1 listed variant with up to 1.0M context, so compare the specs and pricing tables before choosing a production model.
Which Nemotron-Cascade model should I use?
If price is the main constraint, check the pricing table first, because the local data does not include complete provider pricing for Nemotron-Cascade. For the most capable and most recent listed option, evaluate Nemotron-Cascade-2-30B-A3B, which has 1.0M context.

Models (1)