Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM

Name: Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM
Brand: NVIDIA AI
SKU: llama-3.2-nv-embedqa-1b-v1-nvidia-nim

NV-Embed · NVIDIA AI

Serverless · Open Weights

Last refreshed 2026-05-06. Next refresh: weekly.

Why use Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM?

NVIDIA NIM, NVIDIA's deployment platform for GPU-accelerated inference microservices, offers Llama 3.2 NV EmbedQA 1B v1 as a serverless, open-weights endpoint with competitive pricing.

Input / 1M

Output / 1M

Cache: Not sourced

Batch: Not sourced

Setup recipe

Docs fallback

Install: Use the provider REST API or SDK

Auth: Create a provider API key

Call: model: nvidia/llama-3.2-nv-embedqa-1b-v1

Model ID: nvidia/llama-3.2-nv-embedqa-1b-v1

Request example

Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID nvidia/llama-3.2-nv-embedqa-1b-v1.
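Until a curated snippet is sourced, the following is a minimal sketch of an embeddings request, not an official example. It assumes the hosted, OpenAI-compatible endpoint at https://integrate.api.nvidia.com/v1/embeddings, an API key in a NVIDIA_API_KEY environment variable, and an input_type field that accepts "query" or "passage". These are assumptions to check against the NVIDIA NIM documentation. The helper names build_payload, parse_embeddings, and embed are hypothetical.

```python
import json
import os
import urllib.request

# Assumption: hosted NIM endpoint. A self-hosted NIM would use its own base URL.
API_URL = "https://integrate.api.nvidia.com/v1/embeddings"
# Provider model ID from this listing (not the LLMReference slug).
MODEL_ID = "nvidia/llama-3.2-nv-embedqa-1b-v1"


def build_payload(texts, input_type="query"):
    """Build the JSON body for an embeddings request."""
    if input_type not in ("query", "passage"):
        raise ValueError("input_type must be 'query' or 'passage'")
    return {
        "model": MODEL_ID,
        "input": list(texts),
        # Assumption: retrieval models distinguish queries from passages.
        "input_type": input_type,
    }


def parse_embeddings(response_body):
    """Extract vectors from an OpenAI-style response, ordered by index."""
    items = sorted(response_body["data"], key=lambda item: item["index"])
    return [item["embedding"] for item in items]


def embed(texts, input_type="query", api_key=None):
    """Send the request and return one embedding per input text."""
    # Assumption: the key is stored in NVIDIA_API_KEY if not passed in.
    api_key = api_key or os.environ["NVIDIA_API_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(texts, input_type)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return parse_embeddings(json.load(response))
```

Calling embed(["What is NIM?"], input_type="query") would return a list containing one vector if the assumptions above hold. Documents to be indexed would be sent with input_type="passage".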

Gotchas

Use provider model ID "nvidia/llama-3.2-nv-embedqa-1b-v1", not the LLMReference slug "llama-3.2-nv-embedqa-1b-v1".

Capabilities

No model capability flags are currently sourced.

About Llama 3.2 NV EmbedQA 1B v1

NVIDIA multilingual embedding model for question-answering retrieval, based on Llama 3.2. Outputs 2048-dimensional embeddings and supports 26 languages with a 512-token context. Superseded by v2, which extends context to 4K tokens.
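Question-answering retrieval with an embedding model usually means embedding the question and each candidate passage, then ranking passages by vector similarity. The sketch below uses cosine similarity, a common choice that this listing does not itself specify. It runs on toy 2-dimensional vectors for readability, whereas real vectors from this model would have 2048 dimensions. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, best match first."""
    scored = [
        (index, cosine_similarity(query_vec, vec))
        for index, vec in enumerate(passage_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for 2048-dimensional model outputs.
query = [1.0, 0.0]
passages = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_passages(query, passages)
```

In this toy case the passage at index 1 points in the same direction as the query and ranks first, the passage at index 2 is partially aligned, and the passage at index 0 is orthogonal and ranks last.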