LLM Reference

Nemotron-Labs TwoTower 30B-A3B Base

Released: 2026-06-25
Last refreshed: 2026-07-02
Status: Researched today
Open weights | Commercial use: permitted | Coding | Long context | Classification | Open Source

Nemotron-Labs TwoTower 30B-A3B Base is a released open-weight base model with a 128k-token context window, tracked here for coding, long-context, and classification use. No provider token pricing is tracked yet, so treat it as an evaluation candidate until pricing coverage matures.

Use it for

  • Teams evaluating coding, long-context, and classification workloads
  • Workloads that can use a 128k context window

Do not use it for

  • Cost-sensitive launches that need sourced token pricing
  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows

Specifications

Released: 2026-06-25
Context: 128k tokens
Max output: 128,000 tokens
Parameters: ~60B total checkpoint; Hugging Face reports 63B params
Architecture: MoE + SSM Hybrid
Knowledge cutoff: 2025-06
Specialization: general
Openness: Open weights
License: NVIDIA Open Model (commercial use: permitted)
Training: Pretrained

Created by

NVIDIA
Accelerated AI for enterprise solutions

Santa Clara, California, United States
Founded 1993

Pricing

No tracked provider token pricing is available yet.

About

Base text-generation checkpoint for Nemotron-Labs TwoTower. It uses a two-tower block-diffusion architecture over a Mamba-2/Transformer hybrid MoE backbone: a frozen causal AR/context tower processes clean prompt and committed tokens, while a trainable diffusion/denoiser tower fills token blocks by mask diffusion with cross-attention to the context tower. The checkpoint ships both towers and is not an instruction-tuned model.
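The block-by-block filling described above can be pictured with a toy loop. This is a minimal sketch under stated assumptions, not NVIDIA's implementation: `toy_denoiser`, `fill_block`, and `generate` are hypothetical names, the denoiser returns random confidences instead of running a model, and the rule "always commit at least the most confident proposal" is an assumption added so the loop terminates. Only `block_size=16` and `confidence_threshold=0.8` come from the decoding defaults listed with the benchmark scores.

```python
import math
import random

MASK = None  # placeholder for a still-masked position in the current block


def toy_denoiser(context, block, rng):
    """Hypothetical stand-in for the denoiser tower.

    Returns {position: (token, confidence)} for every masked slot. A real
    denoiser would attend to the context tower; this toy only uses the
    context length to name tokens and draws confidences at random.
    """
    proposals = {}
    for i, tok in enumerate(block):
        if tok is MASK:
            proposals[i] = (f"tok{len(context) + i}", rng.random())
    return proposals


def fill_block(context, block_size, confidence_threshold, denoiser, rng):
    """Unmask one block, committing proposals that clear the threshold."""
    block = [MASK] * block_size
    steps = 0
    while any(tok is MASK for tok in block):
        proposals = denoiser(context, block, rng)
        accepted = [i for i, (_, conf) in proposals.items()
                    if conf >= confidence_threshold]
        if not accepted:
            # Assumption: commit the single most confident slot so that
            # every pass makes progress.
            accepted = [max(proposals, key=lambda i: proposals[i][1])]
        for i in accepted:
            block[i] = proposals[i][0]
        steps += 1
    return block, steps


def generate(prompt, num_tokens, block_size=16, confidence_threshold=0.8, seed=0):
    """Generate num_tokens by filling ceil(num_tokens / block_size) blocks."""
    rng = random.Random(seed)
    context = list(prompt)
    total_steps = 0
    for _ in range(math.ceil(num_tokens / block_size)):
        block, steps = fill_block(context, block_size, confidence_threshold,
                                  toy_denoiser, rng)
        context.extend(block)  # committed tokens become clean context
        total_steps += steps
    generated = context[len(prompt):len(prompt) + num_tokens]
    return generated, total_steps


tokens, passes = generate(["a", "b"], num_tokens=40)
print(len(tokens), "tokens in", passes, "denoising passes")
```

In this toy, a threshold of 0.0 commits a whole block in one pass, while a threshold above 1.0 forces one token per pass; the threshold trades passes per block against how confident each committed token must be.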

Nemotron-Labs TwoTower 30B-A3B Base is an open-weight model in the Nemotron-Labs TwoTower family. The structured metadata tracks a 128k-token context window. Headline tracked benchmarks include Massive Multitask Language Understanding 78.2, MMLU PRO 60.9, and AI2 Reasoning Challenge 92.7.

Top use-case fit: coding, agents, and build tasks

Coding

1 relevant benchmark in the decision map.

Long context

Included by capability and metadata signals in the decision map.

Classification

2 relevant benchmarks in the decision map.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

No model capability flags are currently sourced.


Benchmark scores (10)

Scores are benchmark-specific and direction-aware, so the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether a winning margin is meaningful for your workload.
Benchmark | Score | Version
Massive Multitask Language Understanding | 78.2 | 5-shot, accuracy
MMLU PRO | 60.9 | 5-shot, chain-of-thought exact match
AI2 Reasoning Challenge | 92.7 | ARC-Challenge, 25-shot, acc_norm
WinoGrande | 76.1 | 5-shot, accuracy
ReAding Comprehension Dataset From Examinations | 88.9 | 0-shot, accuracy
HumanEval | 75.6 | 0-shot
Mostly Basic Programming Problems | 74.3 | MBPP-Sanitized, 3-shot
Grade School Math 8K | 90.1 | 8-shot, accuracy
MATH-500 | 80.6 | 4-shot
Multilingual Grade School Math | 80.4 | 8-shot, average accuracy

Settings for every row: default TwoTower diffusion decoding at confidence_threshold=0.8, block_size=16, BF16 on 2xH100; evaluator/harness not published on the model card.
Source for every row: https://huggingface.co/nvidia/Nemotron-Labs-TwoTower-30B-A3B-Base-BF16#benchmark-evaluations
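
One way to see why the same numeric gap can matter differently across suites is to express it as a share of the errors still left on each benchmark. The sketch below is illustrative only: `error_share_removed` is a hypothetical helper, the 2.0-point gain is a made-up comparison margin rather than a measured result, and the calculation assumes accuracy-style scores on a 0 to 100 scale where higher is better. The two scores are taken from the table above.

```python
def error_share_removed(score, gain):
    """Fraction of the remaining errors (100 - score) that a gain removes."""
    remaining = 100.0 - score
    if gain > remaining:
        raise ValueError("gain exceeds the remaining headroom")
    return gain / remaining


# Scores from the benchmark table; the gain is a hypothetical margin.
scores = {"AI2 Reasoning Challenge": 92.7, "MMLU PRO": 60.9}
hypothetical_gain = 2.0

for name, score in scores.items():
    share = error_share_removed(score, hypothetical_gain)
    print(f"{name}: +{hypothetical_gain} points removes "
          f"{share:.1%} of remaining errors")
```

Under these assumptions, a 2-point margin removes roughly 27% of the remaining errors at 92.7 but only about 5% at 60.9, so a fixed gap is not comparable across suites without this kind of context.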

Migration checks

No linked migration route is available for this model yet.

API versions

v1.1, v1.0

Frequently asked questions

What is the context window of Nemotron-Labs TwoTower 30B-A3B Base?

Nemotron-Labs TwoTower 30B-A3B Base has a context window of 128k tokens.

What is the max output of Nemotron-Labs TwoTower 30B-A3B Base?

Nemotron-Labs TwoTower 30B-A3B Base can generate up to 128,000 output tokens.

When was Nemotron-Labs TwoTower 30B-A3B Base released?

Nemotron-Labs TwoTower 30B-A3B Base was released on 2026-06-25.

What benchmarks has Nemotron-Labs TwoTower 30B-A3B Base been tested on?

Nemotron-Labs TwoTower 30B-A3B Base has been evaluated on 10 benchmarks, including Massive Multitask Language Understanding, MMLU PRO, AI2 Reasoning Challenge, WinoGrande, and ReAding Comprehension Dataset From Examinations.