Gemma 4 12B

Name: Gemma 4 12B
Author: Google DeepMind

Released

2026-06-03

Last refreshed

2026-06-29

Status

Researched 37d ago

Open source · Commercial use: permitted · Multimodal · RAG · Agents · Long context · Vision · JSON / Tool use

Gemma 4 12B is worth evaluating for RAG, agent, and long-context workloads when one of its tracked provider routes and its 256k-token context window match the workload.

Use it for

Teams evaluating RAG, agents, and long context
Workloads that can use a 256k context window
Buyers comparing 2 tracked provider routes
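
A quick way to check whether a workload can use the 256k context window is to estimate its token count before sending anything to a provider. The sketch below is only a rough pre-check: it assumes "256k" means 256,000 tokens and uses a generic 4-characters-per-token heuristic rather than the actual Gemma tokenizer, and the names `fits_context` and `reserved_output_tokens` are hypothetical. For a real decision, count tokens with the model's own tokenizer.

```python
# Rough pre-check: does a set of documents fit in the context window?
# Assumptions (not from the model page): "256k" is read as 256,000 tokens,
# and 4 characters per token is a generic heuristic, not the Gemma tokenizer.
CONTEXT_WINDOW = 256_000
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Estimate the token count of `text` using ceiling division."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_context(documents, reserved_output_tokens=4096):
    """Return (fits, tokens_used, token_budget) for a list of strings.

    `reserved_output_tokens` is a hypothetical allowance kept free
    for the model's response.
    """
    budget = CONTEXT_WINDOW - reserved_output_tokens
    used = sum(estimate_tokens(doc) for doc in documents)
    return used <= budget, used, budget


if __name__ == "__main__":
    docs = ["example passage " * 1000, "another passage " * 2000]
    fits, used, budget = fits_context(docs)
    print(f"fits={fits} used={used} budget={budget}")
```

If the estimate lands close to the budget, treat the result as inconclusive, since real tokenizers can differ substantially from the heuristic, especially for code and non-English text.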

Do not use it for

Workloads where another current model has stronger sourced task evidence

Specifications

Family: Gemma 4
Released: 2026-06-03
Context: 256k
Parameters: 12B
Architecture: Decoder Only
Knowledge cutoff: 2025-01
Specialization: general
Openness: Open source
License: Apache 2.0 · OSI-approved · Commercial use: permitted
Weights: Available
Code: Unknown
Training: Pretrained

Created by

Google DeepMind

Pioneering artificial intelligence research.

London, United Kingdom

Founded 2014

Pricing

Output / 1M

Input / 1M

Cheapest of 2 routes · Hugging Face Inference Endpoints

Providers(2)

Hugging Face Inference Endpoints
Kaggle Models

Links

Website HuggingFace

About

Google DeepMind's 12B open-weight multimodal model (Apache 2.0), designed to run on a 16GB laptop. First medium-sized model with native audio ingestion alongside text and image. Unified encoder-free decoder-only architecture. Supports 140+ languages. MMLU Pro: 77.2%.
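
The "16GB laptop" claim can be sanity-checked with simple arithmetic on the parameter count. The sketch below estimates memory for the weights alone at a few common precisions. The precision levels are assumptions for illustration (the page does not say which quantization the claim relies on), and the estimate ignores the KV cache, activations, and runtime overhead, all of which add to the total.

```python
# Weights-only memory estimate for a 12B-parameter model.
# Assumption: the precisions listed are illustrative; the page does not
# state which one the "16GB laptop" claim is based on.
PARAMS = 12e9  # 12B parameters, from the specifications


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{label}: {weight_memory_gb(PARAMS, bits):.1f} GB")
    # Prints:
    # 16-bit: 24.0 GB
    # 8-bit: 12.0 GB
    # 4-bit: 6.0 GB
```

Under these assumptions, 16-bit weights alone would exceed 16 GB, so running on such a machine would likely depend on 8-bit or lower quantization, with less headroom left for long contexts.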

Gemma 4 12B is an open-source model in the Gemma 4 family. The structured metadata tracks a 256k-token context window, multimodal input, audio, reasoning, function calling, tool use, and structured outputs. This page tracks provider routes through Hugging Face Inference Endpoints and Kaggle Models. The structured metadata does not yet track a headline benchmark score for Gemma 4 12B; the MMLU Pro figure of 77.2% quoted above comes from the model description.