LLM Reference

NVLM Models by NVIDIA AI

6 models · 2024 · Up to 128k ctx

About

The NVLM 1.0 family consists of multimodal large language models from NVIDIA designed for vision-language tasks. According to NVIDIA, these models rival leading proprietary models such as GPT-4o and compare favorably with open-access models such as Llama 3-V 405B. Notably, NVLM 1.0 improves text-only performance after multimodal training, whereas many multimodal models degrade in text capability. The family comprises three architectures: NVLM-D (decoder-only), NVLM-X (cross-attention-based), and NVLM-H (hybrid), each emphasizing different aspects of multimodal processing. NVIDIA supports open research by releasing the model weights and plans to share the training code. NVLM 1.0 performs well on tasks such as OCR, multimodal reasoning, and coding, extending beyond traditional text-only tasks.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

NVLM-D 72B (Current)

Use when the workload needs 128k context and 72B parameters.

2024-09 · 128k context · 72B parameters

NVLM-D 34B (Current)

Use when the workload needs 34B parameters.

2024-09 · 34B parameters

NVLM-X 72B (Current)

Use when the workload needs 72B parameters.

2024-09 · 72B parameters

NVLM-X 34B (Current)

Use when the workload needs 34B parameters.

2024-09 · 34B parameters

NVLM-H 72B (Current)

Use when the workload needs 72B parameters.

2024-09 · 72B parameters

NVLM-H 34B (Current)

Use when the workload needs 34B parameters.

2024-09 · 34B parameters

Release Timeline

2024-09 (1 release group, 6 current)

NVLM-D 34B · 34B parameters · Current
NVLM-D 72B · 128k context · 72B parameters · Current
NVLM-H 34B · 34B parameters · Current
NVLM-H 72B · 72B parameters · Current
NVLM-X 34B · 34B parameters · Current
NVLM-X 72B · 72B parameters · Current

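Specifications (6 models)

NVLM model specifications comparison

| Model | Released | Context | Parameters |
| --- | --- | --- | --- |
| NVLM-D 72B | 2024-09 | 128k | 72B |
| NVLM-D 34B | 2024-09 | not listed | 34B |
| NVLM-X 72B | 2024-09 | not listed | 72B |
| NVLM-X 34B | 2024-09 | not listed | 34B |
| NVLM-H 72B | 2024-09 | not listed | 72B |
| NVLM-H 34B | 2024-09 | not listed | 34B |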
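
The specification table can also be treated as data when shortlisting a variant. The following is a minimal sketch, not part of any NVIDIA tooling: the `ModelSpec` class, the `SPECS` list, and the `candidates` helper are hypothetical names introduced here for illustration, and the values are copied from the table above. A context of `None` means the table lists no context length for that variant, so the sketch excludes it whenever a minimum context is required.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelSpec:
    """One row of the specification table (hypothetical structure)."""
    name: str
    released: str
    context_k: Optional[int]  # context in thousands of tokens; None if not listed
    params_b: int             # parameter count in billions


# Values transcribed from the specification table above.
SPECS: List[ModelSpec] = [
    ModelSpec("NVLM-D 72B", "2024-09", 128, 72),
    ModelSpec("NVLM-D 34B", "2024-09", None, 34),
    ModelSpec("NVLM-X 72B", "2024-09", None, 72),
    ModelSpec("NVLM-X 34B", "2024-09", None, 34),
    ModelSpec("NVLM-H 72B", "2024-09", None, 72),
    ModelSpec("NVLM-H 34B", "2024-09", None, 34),
]


def candidates(
    specs: List[ModelSpec],
    min_context_k: Optional[int] = None,
    max_params_b: Optional[int] = None,
) -> List[str]:
    """Return names of variants that satisfy the given constraints."""
    selected = []
    for spec in specs:
        if min_context_k is not None:
            # A variant with no listed context cannot be shown to meet the requirement.
            if spec.context_k is None or spec.context_k < min_context_k:
                continue
        if max_params_b is not None and spec.params_b > max_params_b:
            continue
        selected.append(spec.name)
    return selected


if __name__ == "__main__":
    print(candidates(SPECS, min_context_k=128))
    print(candidates(SPECS, max_params_b=34))
```

Treating an unlisted context as "does not qualify" is a conservative choice; it reflects only what the table states, not the actual limits of those variants.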

Frequently Asked Questions

What is NVLM used for?
NVLM is used for vision-language tasks such as OCR, multimodal reasoning, and coding. The family description and listed model capabilities point to those workloads as the best fit.

How does NVLM compare to NVIDIA Nemotron Nano 12B v2 VL?
NVLM by NVIDIA AI is listed as strongest for coding, while NVIDIA Nemotron Nano 12B v2 VL by NVIDIA AI is the closest related family to check for structured outputs. NVLM has 6 listed variants and reaches up to 128k context, so compare the specs and pricing tables before choosing a production model.

Which NVLM model should I use?
If price is the main constraint, check the pricing table first, and note that NVLM does not have complete provider pricing in the local data. For the most capable option in the listed data, evaluate NVLM-D 72B, the only variant listed with 128k context.

Models (6)