Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM

NV-Embed · NVIDIA AI

Serverless · Open Weights

Last refreshed 2026-05-06. Next refresh: weekly.

Why use Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM?

NVIDIA NIM, NVIDIA's deployment platform for GPU-accelerated inference microservices, offers Llama 3.2 NV EmbedQA 1B v1 at a listed rate of $1.00 per GPU-hour on a 1xH100 configuration.

Input / 1M: -
Output / 1M: -
Cache: Not sourced
Batch: Not sourced

Setup recipe

Docs fallback
Install: Use the provider REST API or SDK
Auth: Create a provider API key
Call: model: nvidia/llama-3.2-nv-embedqa-1b-v1
Model ID: nvidia/llama-3.2-nv-embedqa-1b-v1

Request example

Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID nvidia/llama-3.2-nv-embedqa-1b-v1.
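
In the absence of a curated snippet, the following is a minimal sketch of what an embeddings request might look like. It assumes an OpenAI-compatible embeddings endpoint; the URL, the `input_type` field, the response shape, and the `NVIDIA_API_KEY` environment variable name are assumptions not confirmed by this page, so verify them against the NVIDIA NIM documentation before relying on them.

```python
import json
import os
import urllib.request

# Assumption: a hosted, OpenAI-compatible embeddings endpoint.
# Confirm the URL for your deployment in the NVIDIA NIM documentation.
EMBEDDINGS_URL = "https://integrate.api.nvidia.com/v1/embeddings"
MODEL_ID = "nvidia/llama-3.2-nv-embedqa-1b-v1"  # provider model ID from this page


def build_embedding_request(texts, input_type="query"):
    """Build the JSON payload for an embeddings call.

    `input_type` ("query" or "passage") is an assumed parameter for
    retrieval-style embedding models; confirm it in the provider docs.
    """
    return {
        "model": MODEL_ID,
        "input": list(texts),
        "input_type": input_type,
    }


def embed(texts, input_type="query", api_key=None):
    """POST the payload and return one embedding vector per input text."""
    # NVIDIA_API_KEY is a hypothetical environment variable name.
    api_key = api_key or os.environ["NVIDIA_API_KEY"]
    request = urllib.request.Request(
        EMBEDDINGS_URL,
        data=json.dumps(build_embedding_request(texts, input_type)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes an OpenAI-style response: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in body["data"]]
```

Calling `embed(["What is NIM?"])` would send a single query string; the payload builder is kept separate so the request body can be inspected without making a network call.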

Gotchas

  • Use provider model ID "nvidia/llama-3.2-nv-embedqa-1b-v1", not the LLMReference slug "llama-3.2-nv-embedqa-1b-v1".

Pricing

Type            Rate
GPU Hour Rate   $1.00/GPU·hr
GPU Config      1xH100
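
Because the listed rate is per GPU-hour rather than per token, a rough cost estimate is simply rate × GPU count × hours. The sketch below illustrates that arithmetic with the figures listed here; the function name is hypothetical, and actual billing granularity (minimum increments, idle time) is not covered by this page.

```python
GPU_HOUR_RATE_USD = 1.00  # listed rate per GPU-hour
GPU_COUNT = 1             # listed config: 1xH100


def estimate_cost_usd(hours, rate=GPU_HOUR_RATE_USD, gpus=GPU_COUNT):
    """Return the estimated cost of running `gpus` GPUs for `hours` hours."""
    if hours < 0:
        raise ValueError("hours must be non-negative")
    return rate * gpus * hours


# Example: one full day on the listed configuration.
daily_cost = estimate_cost_usd(24)
```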

Capabilities

No model capability flags are currently sourced.

About Llama 3.2 NV EmbedQA 1B v1

NVIDIA multilingual embedding model for question-answering retrieval, based on Llama 3.2. Outputs 2048-dimensional embeddings and supports 26 languages with 512-token context. Superseded by v2, which extends context to 4K tokens.
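
To illustrate how embeddings from a question-answering retrieval model are typically used, here is a minimal sketch that ranks passages against a query by cosine similarity. Cosine similarity is a common choice for this kind of ranking, not a metric this page confirms for the model; the function names are hypothetical, and real vectors would have 2048 components rather than the short ones in the usage example.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


# Toy 2-dimensional vectors stand in for real embeddings.
order = rank_passages([1.0, 0.0], [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
```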

FAQ

What is the context window for Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM?

Llama 3.2 NV EmbedQA 1B v1 supports a 512 token context window on NVIDIA NIM.
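
With a 512-token window, longer documents generally need to be split before embedding. The sketch below chunks text by whitespace-separated words as a rough stand-in for tokens. This is an assumption: the model's tokenizer will count differently, and the default of 256 words is a hypothetical conservative budget, not a measured value.

```python
def chunk_text(text, max_words=256):
    """Split text into chunks of at most `max_words` whitespace-separated words.

    Word count is only a rough proxy for token count; choose `max_words`
    so that each chunk stays under the 512-token window after tokenization.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Example with a tiny budget so the split is easy to see.
chunks = chunk_text("a b c d e", max_words=2)
```

For accurate limits, count tokens with the model's own tokenizer instead of words.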

What API model ID do I use for Llama 3.2 NV EmbedQA 1B v1 on NVIDIA NIM?

Use the model ID nvidia/llama-3.2-nv-embedqa-1b-v1 when calling NVIDIA NIM's API.

Who created Llama 3.2 NV EmbedQA 1B v1?

Llama 3.2 NV EmbedQA 1B v1 was created by NVIDIA AI as part of the NV-Embed model family.

Is Llama 3.2 NV EmbedQA 1B v1 open source?

Llama 3.2 NV EmbedQA 1B v1 is listed as an open-weights model, but that does not necessarily mean it is released under an OSI-approved open-source license.

Model Specs

Released: 2024-10-08
Parameters: 1B
Context: 512
Architecture: encoder