Last refreshed 2026-06-03. Next refresh: weekly.
Why use Gemma 4 12B on Hugging Face Inference Endpoints?
Hugging Face Inference Endpoints offers Gemma 4 12B, although no per-token pricing is listed for it in the provider comparison on this page. Hugging Face is an AI community and platform that aims to democratize artificial intelligence.
Compare Gemma 4 12B across 2 providers to find the best fit for your use case.
Setup recipe
Docs fallback
- Use the provider REST API or SDK.
- Create a provider API key.
- Set model: google/gemma-4-12B
Request example
google/gemma-4-12B.
Gotchas
- Use provider model ID "google/gemma-4-12B", not the LLMReference slug "gemma-4-12b".
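The recipe above (create an API key, call the REST API, pass the provider model ID) can be sketched in Python as follows. This is a minimal sketch, not the provider's documented client. The endpoint URL is a hypothetical placeholder, the `HF_TOKEN` environment variable name is an assumption, and the OpenAI-style `/v1/completions` body (`model`, `prompt`, `max_tokens`) is assumed to be what your deployed serving container accepts. Only the model ID comes from this page.

```python
import json
import os
import urllib.request

# Provider model ID from the recipe above (not the slug "gemma-4-12b").
MODEL_ID = "google/gemma-4-12B"

# Hypothetical placeholder: replace with the URL of your own deployed endpoint.
# The /v1/completions route is an assumption about the serving container.
ENDPOINT_URL = "https://YOUR-ENDPOINT.example/v1/completions"


def build_request(prompt: str, api_key: str, max_tokens: int = 64) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the model ID."""
    body = {"model": MODEL_ID, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        ENDPOINT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # API key sent as a bearer token
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

To try it, set the key in your environment and call `send(build_request("The capital of France is", os.environ["HF_TOKEN"]))`. Keeping request construction separate from sending makes it easy to inspect the payload and confirm the model ID before any network call is made.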
Compare Gemma 4 12B Across Providers
| Provider | Input (per 1M) | Output (per 1M) |
|---|---|---|
| Hugging Face Inference Endpoints | — | — |
| Kaggle Models | — | — |
About Gemma 4 12B
Base pre-trained 12B Gemma 4 model with an encoder-free unified multimodal architecture for text, image, video, and audio input. It supports a 256K context window and is intended for fine-tuning, research, and self-hosted local deployment in the gap between Gemma 4 E4B and the larger 26B MoE / 31B dense variants.
FAQ
What is the context window for Gemma 4 12B on Hugging Face Inference Endpoints?
Gemma 4 12B supports a 256K (262,144-token) context window on Hugging Face Inference Endpoints.
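As a quick arithmetic check, 256K tokens is 256 × 1024 = 262,144, which is where the 262k figure comes from. The sketch below turns that into a simple budget check. It assumes the prompt and the generated tokens share one window, which is typical for decoder-only models but is not stated on this page. The function name is hypothetical.

```python
# 256K context window expressed in tokens: 256 * 1024 = 262,144.
CONTEXT_WINDOW_TOKENS = 256 * 1024


def fits_in_context(prompt_tokens: int, max_new_tokens: int,
                    window: int = CONTEXT_WINDOW_TOKENS) -> bool:
    """Return True if prompt plus requested output fits in the window.

    Assumption: input and output tokens count against the same window.
    """
    return prompt_tokens + max_new_tokens <= window
```

For example, `fits_in_context(260_000, 4_096)` is `False` because 264,096 exceeds 262,144, while `fits_in_context(200_000, 4_096)` is `True`.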
What API model ID do I use for Gemma 4 12B on Hugging Face Inference Endpoints?
Use the model ID google/gemma-4-12B when calling the Hugging Face Inference Endpoints API.
Who created Gemma 4 12B?
Gemma 4 12B was created by Google DeepMind as part of the Gemma 4 model family.
Is Gemma 4 12B open source?
According to the data this page is built from, Gemma 4 12B is open source.