LLM Reference

Llama 3.2 NV EmbedQA 1B v1

Released: 2024-10-08
Last refreshed: 2026-05-06
Status: Researched 37d ago
Open Weights

Llama 3.2 NV EmbedQA 1B v1 is worth evaluating for question-answering retrieval and other text-embedding work when its provider route and 512-token context window match the workload.

Use it for

  • Teams evaluating embedding models for question-answering retrieval
  • Workloads whose inputs fit within a 512-token context window
  • Buyers assessing the single tracked provider route

Do not use it for

  • Vision or document-understanding workloads
  • Strict JSON or tool-calling flows
Specifications
Family: NV-Embed
Released: 2024-10-08
Context: 512
Parameters: 1B
Architecture: encoder
Specialization: embedding
Openness: Open weights
Training: pretrained
Created by

NVIDIA
Accelerated AI for enterprise solutions

Santa Clara, California, United States
Founded 1993
Pricing
Output / 1M: -
Input / 1M: -

Cheapest of 1 route · NVIDIA NIM

About

NVIDIA multilingual embedding model for question-answering retrieval, based on Llama 3.2. Outputs 2048-dimensional embeddings and supports 26 languages with 512-token context. Superseded by v2, which extends context to 4K tokens.
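To illustrate how embeddings like these are typically used for question-answering retrieval, here is a minimal sketch that ranks passages against a query by cosine similarity. It does not call the model: the vectors are toy stand-ins, and the names `rank_passages`, `cosine_similarity`, and `one_hot` are hypothetical helpers introduced only for this example. The only value taken from this page is the 2048-dimensional embedding size.

```python
import math

EMBED_DIM = 2048  # embedding size stated on this page


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def rank_passages(query_vec, passage_vecs):
    """Return (index, score) pairs sorted by descending similarity."""
    scored = [(i, cosine_similarity(query_vec, p)) for i, p in enumerate(passage_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def one_hot(position, dim=EMBED_DIM):
    """Toy stand-in for a real embedding: 1.0 at one position, 0.0 elsewhere."""
    vec = [0.0] * dim
    vec[position] = 1.0
    return vec


if __name__ == "__main__":
    query = one_hot(3)
    passages = [one_hot(7), one_hot(3)]  # hypothetical passage embeddings
    for index, score in rank_passages(query, passages):
        print(index, round(score, 3))
```

In a real pipeline the toy vectors would be replaced by embeddings returned from the provider route, and a vector index would usually take the place of the linear scan once the passage collection grows.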

Llama 3.2 NV EmbedQA 1B v1 is an open-weight model in the NV-Embed family. The structured metadata tracks a 512-token context window. This page tracks provider routes through NVIDIA NIM. No headline benchmark score is tracked for Llama 3.2 NV EmbedQA 1B v1 yet.
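Because the tracked context window is 512 tokens, longer documents generally need to be split before they are embedded. The sketch below shows one simple way to do that with overlapping windows. It is an assumption-heavy illustration: `chunk_text` is a hypothetical helper, and whitespace-separated words are used as a rough proxy for tokens, so a real pipeline would count tokens with the model's own tokenizer and likely use a smaller word budget.

```python
MAX_TOKENS = 512  # context window listed on this page


def chunk_text(text, max_tokens=MAX_TOKENS, overlap=32):
    """Split text into windows of at most max_tokens words.

    Words are only a proxy for model tokens (assumption); consecutive
    windows share `overlap` words so context is not cut off abruptly.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must satisfy 0 <= overlap < max_tokens")
    words = text.split()
    step = max_tokens - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    document = " ".join(f"word{i}" for i in range(1000))
    for number, chunk in enumerate(chunk_text(document)):
        print(number, len(chunk.split()))
```

Each chunk can then be embedded separately and stored alongside a pointer back to its source document.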

Top use-case fit

No primary decision-task fit is mapped for this model yet.

Provider price ladder

Compare API pricing across 1 provider for input and output tokens, plus batch and cached reads when available.

Provider     Input / 1M   Output / 1M   Route
NVIDIA NIM   -            -             Serverless, Partial

Available via routers & gateways (1)

Capabilities

No model capability flags are currently sourced.

Benchmark peer bars for Coding

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.