What is NVLM used for?

NVLM is used for vision-language tasks such as OCR, multimodal reasoning, and coding. The family description and listed model capabilities point to those workloads as the best fit.

How does NVLM compare to NVIDIA Nemotron Nano 12B v2 VL?

NVLM by NVIDIA AI is strongest where you need coding and other vision-language work, while NVIDIA Nemotron Nano 12B v2 VL, also by NVIDIA AI, is the closest related family to check for structured outputs. NVLM has 6 listed variants, and a 128k context window is listed only for NVLM-D 72B, so compare the specifications and provider pricing before choosing a production model.

Which NVLM model should I use?

If price is the main constraint, check provider pricing first, because the local data does not include complete provider pricing for NVLM. All six variants were released in 2024-09, so for the most capable local choice, evaluate NVLM-D 72B, the only variant with a listed context window (128k).

NVLM Models by NVIDIA AI

NVIDIA AI | CC-BY-NC-4.0 | Open weights

6 models | 2024 | Up to 128k context

Details

Researcher: NVIDIA AI

License: CC-BY-NC-4.0

Commercial use: non-commercial

Models: 6

Released: 2024

Max context: 128k

Links

Website HuggingFace

About

The NVLM 1.0 family consists of multimodal large language models from NVIDIA, designed for vision-language tasks. According to NVIDIA, these models rival top-tier proprietary models such as GPT-4o and compare favorably with open-access models such as Llama 3-V 405B. NVLM 1.0 also improves text-only performance after multimodal training, whereas many multimodal models lose text capability. The family comprises three architectures: NVLM-D (decoder-only), NVLM-X (cross-attention-based), and NVLM-H (hybrid), each emphasizing a different aspect of multimodal processing. NVIDIA supports open research by releasing the model weights and plans to share the training code. NVLM 1.0 performs well on tasks such as OCR, multimodal reasoning, and coding, extending beyond traditional text-only tasks.
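To make the difference between the three architectures more concrete, here is a minimal sketch of how many tokens the language model's self-attention would see for one image-plus-text prompt. It rests on a simplified picture that this page does not spell out and that is an assumption here: the decoder-only design places all image tokens in the text sequence, the cross-attention design reads image tokens through separate cross-attention layers, and the hybrid design keeps only a thumbnail in the sequence. The function name and all token counts are hypothetical.

```python
def self_attention_length(arch: str, text_tokens: int,
                          thumbnail_tokens: int, tile_tokens: int) -> int:
    """Tokens in the decoder's self-attention sequence (simplified assumption)."""
    if arch == "D":
        # Decoder-only: every image token joins the text sequence.
        return text_tokens + thumbnail_tokens + tile_tokens
    if arch == "X":
        # Cross-attention: image tokens are attended to separately.
        return text_tokens
    if arch == "H":
        # Hybrid: thumbnail in the sequence, tiles via cross-attention.
        return text_tokens + thumbnail_tokens
    raise ValueError(f"unknown architecture: {arch!r}")


# Hypothetical prompt: 100 text tokens, a 256-token thumbnail,
# and 6 image tiles of 256 tokens each.
lengths = {
    arch: self_attention_length(arch, 100, 256, 6 * 256)
    for arch in ("D", "X", "H")
}
print(lengths)
```

Under these assumptions the decoder-only variant pays for image detail with sequence length, while the cross-attention variant keeps the sequence short; the hybrid sits between the two.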

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

NVLM-D 72B (Current)

Use when the workload needs 128k context and 72B parameters.

2024-09 | 128k context | 72B parameters

NVLM-D 34B (Current)

Use when the workload needs 34B parameters.

2024-09 | 34B parameters

NVLM-X 72B (Current)

Use when the workload needs 72B parameters.

2024-09 | 72B parameters

NVLM-X 34B (Current)

Use when the workload needs 34B parameters.

2024-09 | 34B parameters

NVLM-H 72B (Current)

Use when the workload needs 72B parameters.

2024-09 | 72B parameters

NVLM-H 34B (Current)

Use when the workload needs 34B parameters.

2024-09 | 34B parameters

Current NVLM variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
NVLM-D 72B	Use when the workload needs 128k context and 72B parameters.	2024-09	128k context, 72B parameters	Current
NVLM-D 34B	Use when the workload needs 34B parameters.	2024-09	34B parameters	Current
NVLM-X 72B	Use when the workload needs 72B parameters.	2024-09	72B parameters	Current
NVLM-X 34B	Use when the workload needs 34B parameters.	2024-09	34B parameters	Current
NVLM-H 72B	Use when the workload needs 72B parameters.	2024-09	72B parameters	Current
NVLM-H 34B	Use when the workload needs 34B parameters.	2024-09	34B parameters	Current

Release Timeline

1 release group: 2024-09 (6 current)

NVLM-D 34B | 34B parameters | Current
NVLM-D 72B | 128k context, 72B parameters | Current
NVLM-H 34B | 34B parameters | Current
NVLM-H 72B | 72B parameters | Current
NVLM-X 34B | 34B parameters | Current
NVLM-X 72B | 72B parameters | Current

Specifications (6 models)

NVLM model specifications comparison
Model	Released	Context	Parameters
NVLM-D 72B	2024-09	128k	72B
NVLM-D 34B	2024-09	—	34B
NVLM-X 72B	2024-09	—	72B
NVLM-X 34B	2024-09	—	34B
NVLM-H 72B	2024-09	—	72B
NVLM-H 34B	2024-09	—	34B
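The specifications table can be turned into a small filter for shortlisting variants. The following is a minimal sketch, not an official tool: the `Variant` class and `candidates` function are hypothetical names, and a missing context value (shown as a dash in the table) is treated as unknown, so it never satisfies a minimum-context requirement.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    name: str
    released: str
    context_k: Optional[int]  # None where the table lists no context value
    params_b: int


# Rows copied from the specifications table above.
VARIANTS = [
    Variant("NVLM-D 72B", "2024-09", 128, 72),
    Variant("NVLM-D 34B", "2024-09", None, 34),
    Variant("NVLM-X 72B", "2024-09", None, 72),
    Variant("NVLM-X 34B", "2024-09", None, 34),
    Variant("NVLM-H 72B", "2024-09", None, 72),
    Variant("NVLM-H 34B", "2024-09", None, 34),
]


def candidates(min_context_k: Optional[int] = None,
               max_params_b: Optional[int] = None) -> List[str]:
    """Return variant names that meet the given constraints."""
    names = []
    for v in VARIANTS:
        if min_context_k is not None:
            # Unknown context cannot be assumed to meet the requirement.
            if v.context_k is None or v.context_k < min_context_k:
                continue
        if max_params_b is not None and v.params_b > max_params_b:
            continue
        names.append(v.name)
    return names


print(candidates(min_context_k=100))
print(candidates(max_params_b=34))
```

Treating an unlisted context window as unknown rather than zero keeps the filter conservative: a variant is only shortlisted for long-context work when the table actually states its limit.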
