Nemotron-Cascade Models by NVIDIA AI
1 model · 2026 · Up to 1.0M ctx
About
Cascade mixture-of-experts (MoE) reasoning models with superior performance on math and code tasks.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Nemotron-Cascade-2-30B-A3B | Use when the workload needs 1.0M context and 30B parameters. | 2026-03 | 1.0M context, 30B parameters | Current |
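The page does not say how the "Use when" text is generated from the underlying fields. As a rough illustration only, the sketch below shows one way such a sentence could be built from a variant's context and parameter fields. The `Variant` record, its field names, and the `human` and `use_when` helpers are hypothetical, and treating "1.0M" as exactly 1,000,000 tokens is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    # Hypothetical record shape; field names are assumptions, not the site's schema.
    name: str
    released: str
    context_tokens: int
    parameters: int
    status: str = "Current"


def human(n: int) -> str:
    """Abbreviate a count in the style of the table above ('1.0M', '30B')."""
    if n >= 1_000_000_000:
        return f"{n / 1_000_000_000:g}B"
    if n >= 1_000_000:
        return f"{n / 1_000_000:.1f}M"
    return str(n)


def use_when(v: Variant) -> str:
    """Build a 'Use when' sentence from the context and parameter fields."""
    return (
        f"Use when the workload needs {human(v.context_tokens)} context "
        f"and {human(v.parameters)} parameters."
    )


# Assumed raw values for the single listed variant.
variant = Variant("Nemotron-Cascade-2-30B-A3B", "2026-03", 1_000_000, 30_000_000_000)
print(use_when(variant))
```

A real generator would presumably also consult the capability, release, and replacement fields mentioned above; they are left out here because the page gives no detail on how they are used.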
Release Timeline
2026-03 (1 release group, 1 current)
- Nemotron-Cascade-2-30B-A3B: Current, 1.0M context, 30B parameters
Specifications (1 model)
| Model | Released | Context | Parameters |
|---|---|---|---|
| Nemotron-Cascade-2-30B-A3B | 2026-03 | 1M | 30B |
Frequently Asked Questions
- What is Nemotron-Cascade used for?
- Nemotron-Cascade is used for coding and math-heavy prompts. The family description and listed model capabilities point to those workloads as the best fit.
- How does Nemotron-Cascade compare to NVIDIA Nemotron Nano 12B v2 VL?
- Nemotron-Cascade is the stronger fit for coding workloads, while NVIDIA Nemotron Nano 12B v2 VL, also from NVIDIA AI, is the closest related family to check for structured outputs. Nemotron-Cascade has 1 listed variant with up to 1.0M context, so compare the specs and pricing tables before choosing a production model.
- Which Nemotron-Cascade model should I use?
- If price is the main constraint, check the pricing table first, since provider pricing for Nemotron-Cascade is incomplete in the local data. Otherwise, evaluate Nemotron-Cascade-2-30B-A3B (1.0M context), the latest and only listed variant.




