What is NV-Embed used for?

NV-Embed is used for text embedding, reranking, and code retrieval. The family description and listed model capabilities point to those workloads as the best fit.

How does NV-Embed compare to NVIDIA Nemotron Nano 12B v2 VL?

NV-Embed, from NVIDIA AI, is the better fit when you need embeddings or reranking, while NVIDIA Nemotron Nano 12B v2 VL, also from NVIDIA AI, is the closest related family to check when you need structured outputs. NV-Embed has 4 listed variants and reaches up to 4k context, so compare the specs and pricing tables before choosing a production model.

Which NV-Embed model should I use?

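If price is the main constraint, check the provider's pricing first, because the local data does not include complete provider pricing for NV-Embed. For the latest and largest listed model, evaluate NV-EmbedCode 7B v1 with 4k context; it is aimed at code retrieval. For question-answering retrieval, Llama 3.2 NV EmbedQA 1B v2 is the newer embedding variant, and Llama 3.2 NV RerankQA 1B v2 is the only listed reranker.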

NV-Embed Models by NVIDIA AI


4 models, 2024–2025, up to 4k context

Details

Researcher: NVIDIA AI
Models: 4
Released: 2024–2025
Max context: 4k


About

NV-Embed is NVIDIA's family of specialized embedding and reranking models, including NV-EmbedCode for code retrieval and NV-EmbedQA/RerankQA for question-answering tasks. NVIDIA NIM also hosts BAAI BGE models (such as BGE-M3) as first-class retrieval endpoints in its API catalog.
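To make the split between the embedding and reranking models concrete, here is a minimal two-stage retrieval sketch. It does not call NVIDIA NIM or any real model: the 3-dimensional vectors are hand-written stand-ins for embedding output, and `toy_rerank_score` is a hypothetical placeholder for a reranker. Only the control flow is the point: shortlist passages by cosine similarity, then rescore the shortlist.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def shortlist(query_vec, passage_vecs, k):
    """Stage 1: return the k passage ids closest to the query by cosine similarity."""
    ranked = sorted(
        passage_vecs,
        key=lambda pid: cosine(query_vec, passage_vecs[pid]),
        reverse=True,
    )
    return ranked[:k]

def toy_rerank_score(query, passage):
    """Hypothetical stand-in for a reranker: share of query words found in the passage."""
    q_words = set(query.lower().split())
    p_words = set(passage.lower().split())
    return len(q_words & p_words) / len(q_words) if q_words else 0.0

def retrieve_then_rerank(query, query_vec, passages, passage_vecs, k=2):
    """Stage 2: rescore only the shortlisted passages and sort by that score."""
    candidates = shortlist(query_vec, passage_vecs, k)
    return sorted(
        candidates,
        key=lambda pid: toy_rerank_score(query, passages[pid]),
        reverse=True,
    )

# Toy corpus; the vectors are made up for illustration, not real embeddings.
passages = {
    "p1": "how to reverse a list in python",
    "p2": "reverse a string in python with slicing",
    "p3": "baking bread at home",
}
passage_vecs = {
    "p1": [0.9, 0.1, 0.0],
    "p2": [0.8, 0.3, 0.0],
    "p3": [0.0, 0.1, 0.9],
}
query = "reverse a string in python"
query_vec = [0.85, 0.2, 0.0]

result = retrieve_then_rerank(query, query_vec, passages, passage_vecs, k=2)
print(result)
```

In this toy data the vector stage ranks p1 slightly ahead of p2, and the second stage swaps them. That is the usual reason to pair a cheap embedding shortlist with a more careful reranker.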

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.


NV-EmbedCode 7B v1 (Current)

Use when the workload needs embeddings from a 7B-parameter model with a 4k context window.

2025-06, embedding, 4k context, 7B parameters

Llama 3.2 NV EmbedQA 1B v2 (Current)

Use when the workload needs embeddings from a 1B-parameter model with a 4k context window.

2025-03, embedding, 4k context, 1B parameters

Llama 3.2 NV RerankQA 1B v2 (Current)

Use when the workload needs ranking from a 1B-parameter model with a 4k context window.

2025-03, ranking, 4k context, 1B parameters

Llama 3.2 NV EmbedQA 1B v1 (Current)

Use when the workload needs embeddings from a 1B-parameter model and inputs fit in a 512 context window.

2024-10, embedding, 512 context, 1B parameters

Current NV-Embed variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
NV-EmbedCode 7B v1	Use when the workload needs embeddings from a 7B-parameter model with a 4k context window.	2025-06	embedding, 4k context, 7B parameters	Current
Llama 3.2 NV EmbedQA 1B v2	Use when the workload needs embeddings from a 1B-parameter model with a 4k context window.	2025-03	embedding, 4k context, 1B parameters	Current
Llama 3.2 NV RerankQA 1B v2	Use when the workload needs ranking from a 1B-parameter model with a 4k context window.	2025-03	ranking, 4k context, 1B parameters	Current
Llama 3.2 NV EmbedQA 1B v1	Use when the workload needs embeddings from a 1B-parameter model and inputs fit in a 512 context window.	2024-10	embedding, 512 context, 1B parameters	Current
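The use-when guidance above amounts to filtering on task, context window, and size. A small sketch of that filter follows, using only the values from the table. The `Variant` class and `pick_variants` function are hypothetical helpers, and reading "4k" as 4096 is an assumption. The table does not distinguish code retrieval from question answering, so this filter does not either.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Variant:
    name: str
    task: str       # "embedding" or "ranking", as listed in the table
    context: int    # context window; "4k" is taken as 4096 here (assumption)
    params_b: int   # parameter count in billions
    released: str   # YYYY-MM

# Rows copied from the variants table above.
VARIANTS = [
    Variant("NV-EmbedCode 7B v1", "embedding", 4096, 7, "2025-06"),
    Variant("Llama 3.2 NV EmbedQA 1B v2", "embedding", 4096, 1, "2025-03"),
    Variant("Llama 3.2 NV RerankQA 1B v2", "ranking", 4096, 1, "2025-03"),
    Variant("Llama 3.2 NV EmbedQA 1B v1", "embedding", 512, 1, "2024-10"),
]

def pick_variants(task, min_context=0, max_params_b=None):
    """Return variants matching the task and limits, newest release first."""
    matches = [
        v for v in VARIANTS
        if v.task == task
        and v.context >= min_context
        and (max_params_b is None or v.params_b <= max_params_b)
    ]
    # YYYY-MM strings sort chronologically, so a plain string sort is enough.
    return sorted(matches, key=lambda v: v.released, reverse=True)

# Example: a small embedding model that can take inputs longer than 512.
for v in pick_variants("embedding", min_context=1024, max_params_b=1):
    print(v.name, v.released)
```

Sorting newest first mirrors the page's own ordering. Swap the sort key if size or context matters more than recency for your workload.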

Release Timeline

3 release groups

2025-06 (1 current)
NV-EmbedCode 7B v1: embedding, 4k context, 7B parameters (Current)

2025-03 (2 current)
Llama 3.2 NV EmbedQA 1B v2: embedding, 4k context, 1B parameters (Current)
Llama 3.2 NV RerankQA 1B v2: ranking, 4k context, 1B parameters (Current)

2024-10 (1 current)
Llama 3.2 NV EmbedQA 1B v1: embedding, 512 context, 1B parameters (Current)

Specifications (4 models)

NV-Embed model specifications comparison
Model	Released	Context	Parameters
NV-EmbedCode 7B v1	2025-06	4k	7B
Llama 3.2 NV EmbedQA 1B v2	2025-03	4k	1B
Llama 3.2 NV RerankQA 1B v2	2025-03	4k	1B
Llama 3.2 NV EmbedQA 1B v1	2024-10	512	1B
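The context column matters in practice because inputs longer than the window have to be split before embedding, especially for the 512-context v1 model. Below is a rough chunking sketch. `chunk_words` is a hypothetical helper, and it counts whitespace-separated words as a stand-in for tokens, which is an assumption. Real tokenizers usually produce more tokens than words, so leave headroom below the listed limit.

```python
def chunk_words(text, max_tokens, overlap=0):
    """Split text into windows of at most max_tokens words.

    Words are a crude proxy for tokens (assumption); consecutive windows
    share `overlap` words so that context is not cut at a hard boundary.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        # Stop once this window has reached the end of the text.
        if start + max_tokens >= len(words):
            break
    return chunks

# Example: a 1000-word document against a 512 window with 32 words of overlap.
document = " ".join(f"w{i}" for i in range(1000))
chunks = chunk_words(document, max_tokens=512, overlap=32)
print([len(c.split()) for c in chunks])
```

With the 4k-context variants the same document would fit in a single chunk under this word-count approximation.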

Available From (1 provider)

NVIDIA NIM
