Gemma 4 12B on Hugging Face Inference Endpoints

Gemma 4 · Google DeepMind

Open Source

Last refreshed 2026-06-03. Next refresh: weekly.

Why use Gemma 4 12B on Hugging Face Inference Endpoints?

Hugging Face Inference Endpoints offers Gemma 4 12B as a hosted deployment option. Per-token pricing for this provider is not listed on this page. Hugging Face is an AI community and platform dedicated to democratizing artificial intelligence.

Compare Gemma 4 12B across 2 providers to find the best fit for your use case
Input / 1M: -
Output / 1M: -
Cache: Not sourced
Batch: Not sourced

Setup recipe

Docs fallback

Install: Use the provider REST API or SDK
Auth: Create a provider API key
Call: model: google/gemma-4-12B
Model ID: google/gemma-4-12B

Request example

Curated snippets for this provider are not sourced yet. Use Hugging Face Inference Endpoints documentation with model ID google/gemma-4-12B.
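Until a curated snippet is available, the following is a minimal sketch of how a request might be assembled, not the provider's documented method. It assumes the deployed endpoint exposes an OpenAI-compatible /v1/completions route, which depends on the serving container chosen at deployment. The function name build_completion_request, the endpoint URL, and the token are hypothetical placeholders. The request is built but not sent.

```python
import json
import urllib.request

# Provider model ID as listed on this page.
MODEL_ID = "google/gemma-4-12B"


def build_completion_request(endpoint_url, api_token, prompt, max_tokens=64):
    """Build (but do not send) a POST request for a text completion.

    Assumption: the endpoint serves an OpenAI-compatible /v1/completions
    route. Check the endpoint's own documentation before relying on this.
    """
    payload = {"model": MODEL_ID, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        url=endpoint_url.rstrip("/") + "/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the returned object to urllib.request.urlopen with a real endpoint URL and API key. Because this is a base pre-trained model, a plain completion prompt is used here rather than a chat-style message list.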

Gotchas

  • Use provider model ID "google/gemma-4-12B", not the LLMReference slug "gemma-4-12b".
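One way to guard against passing the slug by mistake is a small lookup applied before each call. This is an illustrative sketch only: the table SLUG_TO_PROVIDER_ID and the helper provider_model_id are hypothetical names, and the table holds just the one pair stated above.

```python
# Hypothetical lookup table; only the pair stated on this page is included.
SLUG_TO_PROVIDER_ID = {"gemma-4-12b": "google/gemma-4-12B"}


def provider_model_id(name):
    """Map a known LLMReference slug to its provider model ID.

    Names that are not known slugs are returned unchanged.
    """
    return SLUG_TO_PROVIDER_ID.get(name, name)
```

Calling provider_model_id("gemma-4-12b") yields the provider ID, while an already-correct ID passes through untouched.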

Compare Gemma 4 12B Across Providers

Provider · Input (per 1M) · Output (per 1M)
Hugging Face Inference Endpoints
Kaggle Models

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Audio

About Gemma 4 12B

Base pre-trained 12B Gemma 4 model with an encoder-free unified multimodal architecture for text, image, video, and audio input. It supports a 256K context window and is intended for fine-tuning, research, and self-hosted local deployment in the gap between Gemma 4 E4B and the larger 26B MoE / 31B dense variants.

FAQ

What is the context window for Gemma 4 12B on Hugging Face Inference Endpoints?

Gemma 4 12B supports a 256K-token context window on Hugging Face Inference Endpoints (about 262k tokens when 1K is counted as 1,024).

What API model ID do I use for Gemma 4 12B on Hugging Face Inference Endpoints?

Use the model ID google/gemma-4-12B when calling the Hugging Face Inference Endpoints API.

Who created Gemma 4 12B?

Gemma 4 12B was created by Google DeepMind as part of the Gemma 4 model family.

Is Gemma 4 12B open source?

Gemma 4 12B is listed as open source in this reference's data.

Model Specs

Released: 2026-06-03
Parameters: 11.9B
Context: 256k
Architecture: encoder_free_unified_multimodal
Knowledge cutoff: 2025-01