Last refreshed 2026-05-19. Next refresh: weekly.
Why use Llama 3.2 NV EmbedQA 1B v2 on NVIDIA NIM?
NVIDIA NIM offers Llama 3.2 NV EmbedQA 1B v2 at $1.00 per GPU-hour on a 1xH100 configuration. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.
Setup recipe
Docs fallback
- Use the provider REST API or SDK.
- Create a provider API key.
- Set model: nvidia/llama-3.2-nv-embedqa-1b-v2
Request example: use model nvidia/llama-3.2-nv-embedqa-1b-v2.
Gotchas
- Use provider model ID "nvidia/llama-3.2-nv-embedqa-1b-v2", not the LLMReference slug "llama-3.2-nv-embedqa-1b-v2".
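The setup steps and the model ID gotcha can be sketched as a small Python client. This is a minimal sketch, not the provider's documented recipe: the base URL, the `input_type` field, the OpenAI-style response shape, and the `NVIDIA_API_KEY` environment variable name are all assumptions to verify against the NIM documentation for your deployment. Only the model ID comes from this page.

```python
import json
import os
import urllib.request

# Assumed endpoint for NVIDIA's hosted, OpenAI-compatible API. A self-hosted
# NIM container exposes its own base URL, so treat this as a placeholder.
BASE_URL = "https://integrate.api.nvidia.com/v1"

# Provider model ID from this page: keep the "nvidia/" prefix.
MODEL_ID = "nvidia/llama-3.2-nv-embedqa-1b-v2"


def build_embedding_request(texts, input_type="query"):
    """Build the JSON body for an embeddings call.

    `input_type` ("query" or "passage") is an assumption about the
    provider's request schema, not something this page confirms.
    """
    return {"model": MODEL_ID, "input": list(texts), "input_type": input_type}


def embed(texts, api_key, input_type="query"):
    """POST the request and return one embedding vector per input text."""
    body = json.dumps(build_embedding_request(texts, input_type)).encode("utf-8")
    request = urllib.request.Request(
        f"{BASE_URL}/embeddings",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    # Assumes an OpenAI-style response: {"data": [{"embedding": [...]}, ...]}
    return [item["embedding"] for item in payload["data"]]


if __name__ == "__main__":
    key = os.environ.get("NVIDIA_API_KEY")  # hypothetical variable name
    if key:
        vectors = embed(["What does NIM deploy?"], key)
        print(len(vectors), "vector(s) returned")
```

Building the payload in a separate function keeps the model ID in one place, so the request can be inspected without making a network call.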
Pricing
| Type | Rate |
|---|---|
| GPU Hour Rate | $1.00/GPU·hr |
| GPU Config | 1xH100 |
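Because the rate is quoted per GPU-hour rather than per token, cost depends on how long the deployment runs, not on request volume. The sketch below turns the table's figures into a rough estimate. It assumes billing is linear in GPU-hours with no minimums, discounts, or extra fees, which this page does not confirm, and `estimate_gpu_cost` is a hypothetical helper name.

```python
def estimate_gpu_cost(hours, gpus=1, rate_per_gpu_hour=1.00):
    """Estimate cost in USD for `hours` of wall-clock runtime.

    Defaults mirror the table: 1xH100 at $1.00 per GPU-hour.
    Assumes simple linear billing (hours x GPUs x rate).
    """
    if hours < 0 or gpus < 1:
        raise ValueError("hours must be >= 0 and gpus must be >= 1")
    return hours * gpus * rate_per_gpu_hour


# An always-on single-GPU deployment for a 30-day month: 720 GPU-hours.
monthly = estimate_gpu_cost(24 * 30)
print(f"${monthly:.2f}")  # $720.00
```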
Capabilities
No model capability flags are currently sourced.
About Llama 3.2 NV EmbedQA 1B v2
Llama 3.2 NV EmbedQA 1B v2 is NVIDIA AI's NV-Embed model focused on text embeddings for retrieval and semantic search. It was released on 2025-03-01.
FAQ
What is the context window for Llama 3.2 NV EmbedQA 1B v2 on NVIDIA NIM?
Llama 3.2 NV EmbedQA 1B v2 supports a 4k token context window on NVIDIA NIM.
What API model ID do I use for Llama 3.2 NV EmbedQA 1B v2 on NVIDIA NIM?
Use the model ID nvidia/llama-3.2-nv-embedqa-1b-v2 when calling NVIDIA NIM's API.
Who created Llama 3.2 NV EmbedQA 1B v2?
Llama 3.2 NV EmbedQA 1B v2 was created by NVIDIA AI as part of the NV-Embed model family.
Is Llama 3.2 NV EmbedQA 1B v2 open source?
Llama 3.2 NV EmbedQA 1B v2 is listed as having open weights, but open weights do not necessarily mean an OSI-approved open-source license.