Last refreshed 2026-06-03. Next refresh: weekly.
Why use Gemma 4 12B IT on Hugging Face Inference Endpoints?
Hugging Face Inference Endpoints offers Gemma 4 12B IT with competitive pricing. Hugging Face is a leading AI community and platform dedicated to democratizing artificial intelligence.
Compare Gemma 4 12B IT across 2 providers to find the best fit for your use caseSetup recipe
Docs fallbackUse the provider REST API or SDKCreate a provider API keymodel: google/gemma-4-12B-itgoogle/gemma-4-12B-itRequest example
google/gemma-4-12B-it.Gotchas
- Use provider model ID "google/gemma-4-12B-it", not the LLMReference slug "gemma-4-12b-it".
Compare Gemma 4 12B IT Across Providers
| Provider | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Hugging Face Inference Endpoints | — | — |
| Kaggle Models | — | — |
Capabilities
About Gemma 4 12B IT
Instruction-tuned 12B Gemma 4 model with native text, image, video, and audio input through an encoder-free unified architecture. It runs on 16 GB VRAM in BF16, supports a 256K context window, configurable thinking mode, function calling, structured outputs, and 140+ languages, making it the mid-sized Gemma 4 option between E4B and the 26B MoE.
FAQ
What is the context window for Gemma 4 12B IT on Hugging Face Inference Endpoints?
Gemma 4 12B IT supports a 262k token context window on Hugging Face Inference Endpoints.
What API model ID do I use for Gemma 4 12B IT on Hugging Face Inference Endpoints?
Use the model ID google/gemma-4-12B-it when calling Hugging Face Inference Endpoints's API.
Who created Gemma 4 12B IT?
Gemma 4 12B IT was created by Google DeepMind as part of the Gemma 4 model family.
Is Gemma 4 12B IT open source?
Gemma 4 12B IT is open source according to the seed data.