Llama 3.2 NV EmbedQA 1B v1
Llama 3.2 NV EmbedQA 1B v1 is worth evaluating for question-answering retrieval (text embedding) workloads when its provider route and 512-token context window match the workload.
Use it for
- Teams evaluating embedding models for question-answering retrieval
- Workloads that fit within a 512-token context window
- Buyers comparing the 1 tracked provider route
Do not use it for
- Vision or document-understanding workloads
- Strict JSON or tool-calling flows
- Family: NV-Embed
- Released: 2024-10-08
- Context: 512 tokens
- Parameters: 1B
- Architecture: encoder
- Specialization: embedding
- Openness: Open weights
- Training: pretrained
Cheapest of 1 route · NVIDIA NIM
About
NVIDIA multilingual embedding model for question-answering retrieval, based on Llama 3.2. Outputs 2048-dimensional embeddings and supports 26 languages with a 512-token context window. Superseded by v2, which extends context to 4K tokens.
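To make the retrieval use case concrete, here is a minimal sketch of ranking passages against a question by cosine similarity over 2048-dimensional vectors. It does not call the model: `fake_embedding` is a hypothetical stand-in that returns deterministic random vectors, and `rank_passages` is an illustrative helper name, not part of any NVIDIA API. Cosine similarity is assumed here as the scoring function; the page does not state which metric the model is tuned for.

```python
import math
import random

EMBEDDING_DIM = 2048  # output size stated in the model description


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return passage indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, p) for p in passage_vecs]
    return sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)


def fake_embedding(seed):
    """Hypothetical stand-in for a model call: a deterministic random vector."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(EMBEDDING_DIM)]


passages = [fake_embedding(seed) for seed in (1, 2, 3)]
# Build the query as a lightly perturbed copy of passage 1 so it should rank first.
noise = fake_embedding(99)
query = [p + 0.1 * n for p, n in zip(passages[1], noise)]

ranking = rank_passages(query, passages)
print(ranking)
```

In a real pipeline the placeholder vectors would be replaced by embeddings returned from the provider route, with passage vectors computed once and stored in an index.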
Llama 3.2 NV EmbedQA 1B v1 is an open-weight model in the NV-Embed family. The structured metadata tracks a 512-token context window. This page tracks provider routes through NVIDIA NIM. No headline benchmark score is tracked for Llama 3.2 NV EmbedQA 1B v1 yet.
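One practical consequence of a 512-token window is that longer documents need to be split before embedding. The sketch below shows one possible approach, assuming whitespace-separated words as a rough proxy for tokens; the function name `chunk_by_words` and the 32-word overlap are illustrative choices, not values from this page. A real tokenizer typically produces more tokens than words, so a lower limit may be needed in practice.

```python
MAX_TOKENS = 512  # context window tracked on this page


def chunk_by_words(text, max_tokens=MAX_TOKENS, overlap=32):
    """Split text into overlapping chunks of at most max_tokens whitespace words.

    Whitespace words only approximate model tokens, so treat max_tokens
    as an upper bound that may need headroom.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        # Stop once this chunk has reached the end of the text.
        if start + max_tokens >= len(words):
            break
    return chunks


# A synthetic 1200-word document for demonstration.
document = " ".join(f"w{i}" for i in range(1200))
chunks = chunk_by_words(document)
print(len(chunks), [len(c.split()) for c in chunks])  # 3 [512, 512, 240]
```

The overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of embedding some text twice.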
Top use-case fit
No primary decision-task fit is mapped for this model yet.
Provider price ladder
Compare API pricing for the 1 tracked provider: input and output tokens, plus batch and cached reads when available.
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| NVIDIA NIM | - | - | Serverless · Partial |
Available via routers & gateways (1)
Capabilities
No model capability flags are currently sourced.
Benchmark peer bars for Coding
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
Comparison and alternatives
Browse all comparisons →