Last refreshed 2026-05-22. Next refresh: weekly.
Why use Gemma 3n 2B (free) on NVIDIA NIM?
NVIDIA NIM offers Gemma 3n 2B (free) with competitive pricing. NVIDIA NIM is NVIDIA's deployment platform for GPU-accelerated inference microservices.
Input / 1M
-
Output / 1M
-
Cache
Not sourced
Batch
Not sourced
Setup recipe
Docs fallbackInstall
Use the provider REST API or SDKAuth
Create a provider API keyCall
model: google/gemma-3n-e2b-itModel ID
google/gemma-3n-e2b-itRequest example
Curated snippets for this provider are not sourced yet. Use NVIDIA NIM documentation with model ID
google/gemma-3n-e2b-it.Gotchas
- Use provider model ID "google/gemma-3n-e2b-it", not the LLMReference slug "gemma-3n-e2b-it".
Pricing
| Type | Rate |
|---|---|
| GPU Hour Rate | $1.00/GPU·hr |
| GPU Config | 1xH100 |
Capabilities
No model capability flags are currently sourced.
About Gemma 3n 2B (free)
Google: Gemma 3n 2B (free) available via OpenRouter. Pricing: $null/1M input, $null/1M output.
FAQ
What is the context window for Gemma 3n 2B (free) on NVIDIA NIM?
Gemma 3n 2B (free) supports a 8k token context window on NVIDIA NIM.
What API model ID do I use for Gemma 3n 2B (free) on NVIDIA NIM?
Use the model ID google/gemma-3n-e2b-it when calling NVIDIA NIM's API.
Who created Gemma 3n 2B (free)?
Gemma 3n 2B (free) was created by Google DeepMind as part of the Gemma 3 model family.
Is Gemma 3n 2B (free) open source?
Gemma 3n 2B (free) has open weights under Gemma according to the seed data, but that does not necessarily mean an OSI-approved open-source license.
Model Specs
Released2025-04-03
Parameters5B (2B effective active)
Context8k
ArchitectureDecoder Only
Knowledge cutoff2024-06