LLM ReferenceLLM Reference
NVIDIA NIM

Llama 3.1 Nemotron Nano 4B v1.1 on NVIDIA NIM

Nemotron Nano 2 · NVIDIA AI

Serverless

Last refreshed 2026-05-01. Next refresh: weekly.

Why use Llama 3.1 Nemotron Nano 4B v1.1 on NVIDIA NIM?

NVIDIA NIM offers Llama 3.1 Nemotron Nano 4B v1.1 with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.

Input / 1M
-
Output / 1M
-
Cache
Not sourced
Batch
Not sourced

Setup recipe

Docs fallback
Install
Use the provider REST API or SDK
Auth
Create a provider API key
Call
model: nvidia/llama-3.1-nemotron-nano-4b-v1.1
Model ID
nvidia/llama-3.1-nemotron-nano-4b-v1.1

Request example

Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID nvidia/llama-3.1-nemotron-nano-4b-v1.1.

Gotchas

  • Use provider model ID "nvidia/llama-3.1-nemotron-nano-4b-v1.1", not the LLMReference slug "llama-3.1-nemotron-nano-4b-v1.1".

Pricing

TypeRate
GPU Hour Rate$1.00/GPU·hr

Capabilities

No model capability flags are currently sourced.

About Llama 3.1 Nemotron Nano 4B v1.1

Compact 4B parameter NVIDIA Nemotron Nano model for edge inference.

FAQ

What is the context window for Llama 3.1 Nemotron Nano 4B v1.1 on NVIDIA NIM?

Llama 3.1 Nemotron Nano 4B v1.1 supports a 4,000 token context window on NVIDIA NIM.

What API model ID do I use for Llama 3.1 Nemotron Nano 4B v1.1 on NVIDIA NIM?

Use the model ID nvidia/llama-3.1-nemotron-nano-4b-v1.1 when calling NVIDIA NIM's API.

Who created Llama 3.1 Nemotron Nano 4B v1.1?

Llama 3.1 Nemotron Nano 4B v1.1 was created by NVIDIA AI as part of the Nemotron Nano 2 model family.

Is Llama 3.1 Nemotron Nano 4B v1.1 open source?

Llama 3.1 Nemotron Nano 4B v1.1 is open source according to the seed data.

Get Started

Model Specs

Released2025-04-01
Parameters4B
Context4K
ArchitectureDecoder Only