Last refreshed 2026-05-19. Next refresh: weekly.
Why use Llama 3.2 1B Instruct on NVIDIA NIM?
NVIDIA NIM offers Llama 3.2 1B Instruct with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.
Compare Llama 3.2 1B Instruct across 7 providers to find the best fit for your use caseSetup recipe
Docs fallbackUse the provider REST API or SDKCreate a provider API keymodel: meta/llama-3.2-1b-instructmeta/llama-3.2-1b-instructRequest example
meta/llama-3.2-1b-instruct.Gotchas
- Use provider model ID "meta/llama-3.2-1b-instruct", not the LLMReference slug "llama-3.2-1b-instruct".
Compare Llama 3.2 1B Instruct Across Providers
| Provider | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Cloudflare Workers AI | $0.03 | $0.20 |
| OpenRouter | $0.03 | $0.20 |
| Fireworks AI | $0.10 | $0.10 |
| NVIDIA NIM | — | — |
| Bitdeer AI | $0.15 | $0.45 |
Pricing
| Type | Rate |
|---|---|
| GPU Hour Rate | $1.00/GPU·hr |
| GPU Config | 1xH100 |
Capabilities
About Llama 3.2 1B Instruct
Llama 3.2 1B Instruct is Meta's Llama 3.2 model. It offers a 128K-token context window with weights openly available for self-hosting and scores 25.6 on GPQA.
FAQ
What is the context window for Llama 3.2 1B Instruct on NVIDIA NIM?
Llama 3.2 1B Instruct supports a 128k token context window on NVIDIA NIM.
How does NVIDIA NIM compare to other Llama 3.2 1B Instruct providers?
Llama 3.2 1B Instruct is available from 7 providers. The cheapest input pricing is $0.027/1M tokens from Cloudflare Workers AI.
What API model ID do I use for Llama 3.2 1B Instruct on NVIDIA NIM?
Use the model ID meta/llama-3.2-1b-instruct when calling NVIDIA NIM's API.
Who created Llama 3.2 1B Instruct?
Llama 3.2 1B Instruct was created by AI at Meta as part of the Llama 3.2 model family.
Is Llama 3.2 1B Instruct open source?
Llama 3.2 1B Instruct has open weights under Llama 3 Community according to the seed data, but that does not necessarily mean an OSI-approved open-source license.