LLM Reference

Gemma 4 Models by Google DeepMind

Google DeepMind | Apache 2.0 | Open source
10 models | 2026 | Up to 256k context | From $0.06/1M input tokens

Details

Researcher: Google DeepMind
License: Apache 2.0 (OSI-approved)
Commercial use: permitted
Models: 10
Released: 2026
Max context: 256k

Capabilities

Vision: 6 of 10 models
Multimodal: All models
Reasoning: 2 of 10 models
Function Calling: All models
Tool Use: 2 of 10 models
Structured Outputs: 6 of 10 models

About

Google's most capable open-source model family, purpose-built for advanced reasoning and agentic workflows. Delivered in five sizes (E2B, E4B, 12B dense, 26B MoE, 31B dense) with multimodal capabilities including text, image, video, and audio processing.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.


Gemma 4 12B and Gemma 4 12B IT (2026-06): Use when the workload needs 256k context, 12B parameters, and reasoning.

Gemma 4 E2B and Gemma 4 E2B IT (2026-03): Use when the workload needs 128k context, 2B parameters, and function calling.

Gemma 4 E4B and Gemma 4 E4B IT (2026-03): Use when the workload needs 128k context, 4B parameters, and function calling.

Gemma 4 26B A4B and Gemma 4 26B A4B IT (2026-03): Use when the workload needs 256k context, 26B parameters, and function calling.

Gemma 4 31B and Gemma 4 31B IT (2026-03): Use when the workload needs 256k context, 31B parameters, and function calling.

Release Timeline

2 release groups

2026-06 (2 current)
Gemma 4 12B: 256k context, 12B parameters, reasoning (Current)
Gemma 4 12B IT: 256k context, 12B parameters, reasoning (Current)

2026-03 (8 current)
Gemma 4 26B A4B: 256k context, 26B parameters, function calling (Current)
Gemma 4 26B A4B IT: 256k context, 26B parameters, function calling (Current)
Gemma 4 31B: 256k context, 31B parameters, function calling (Current)
Gemma 4 31B IT: 256k context, 31B parameters, function calling (Current)
Gemma 4 E2B: 128k context, 2B parameters, function calling (Current)
Gemma 4 E2B IT: 128k context, 2B parameters, function calling (Current)
Gemma 4 E4B: 128k context, 4B parameters, function calling (Current)
Gemma 4 E4B IT: 128k context, 4B parameters, function calling (Current)

Specifications (10 models)

Gemma 4 model specifications comparison
Model               Released  Context  Parameters  Vision  Multimodal  Reasoning  Fn Calling  Tool Use  Structured Outputs
Gemma 4 12B         2026-06   256k     12B         Yes     Yes         Yes        Yes         Yes       Yes
Gemma 4 12B IT      2026-06   256k     12B         Yes     Yes         Yes        Yes         Yes       Yes
Gemma 4 E2B         2026-03   128k     2B          No      Yes         No         Yes         No        No
Gemma 4 E2B IT      2026-03   128k     2B          No      Yes         No         Yes         No        Yes
Gemma 4 E4B         2026-03   128k     4B          No      Yes         No         Yes         No        No
Gemma 4 E4B IT      2026-03   128k     4B          No      Yes         No         Yes         No        Yes
Gemma 4 26B A4B     2026-03   256k     26B         Yes     Yes         No         Yes         No        No
Gemma 4 26B A4B IT  2026-03   256k     26B         Yes     Yes         No         Yes         No        Yes
Gemma 4 31B         2026-03   256k     31B         Yes     Yes         No         Yes         No        No
Gemma 4 31B IT      2026-03   256k     31B         Yes     Yes         No         Yes         No        Yes
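
As a rough illustration of how the specifications above can be used to shortlist variants, the sketch below transcribes a subset of the columns into Python and filters on a minimum context size plus required capability flags. The names ModelSpec, SPECS, and matching_models are hypothetical helpers introduced for this example only; they are not part of any Gemma or provider API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelSpec:
    """One row of the specifications table (subset of columns)."""
    name: str
    context_k: int
    vision: bool
    reasoning: bool
    tool_use: bool
    structured_outputs: bool


# Values transcribed from the specifications table; multimodal and
# function calling are omitted because every listed model supports them.
SPECS = [
    ModelSpec("Gemma 4 12B", 256, True, True, True, True),
    ModelSpec("Gemma 4 12B IT", 256, True, True, True, True),
    ModelSpec("Gemma 4 E2B", 128, False, False, False, False),
    ModelSpec("Gemma 4 E2B IT", 128, False, False, False, True),
    ModelSpec("Gemma 4 E4B", 128, False, False, False, False),
    ModelSpec("Gemma 4 E4B IT", 128, False, False, False, True),
    ModelSpec("Gemma 4 26B A4B", 256, True, False, False, False),
    ModelSpec("Gemma 4 26B A4B IT", 256, True, False, False, True),
    ModelSpec("Gemma 4 31B", 256, True, False, False, False),
    ModelSpec("Gemma 4 31B IT", 256, True, False, False, True),
]


def matching_models(min_context_k=0, **required):
    """Return names of models that meet the context floor and every flag set to True."""
    return [
        spec.name
        for spec in SPECS
        if spec.context_k >= min_context_k
        and all(getattr(spec, flag) for flag, wanted in required.items() if wanted)
    ]


# Example: long-context models with vision and structured outputs.
print(matching_models(min_context_k=256, vision=True, structured_outputs=True))
```

Filtering on reasoning=True or tool_use=True narrows the list to the two 12B variants, which matches the "2 of 10 models" counts in the Capabilities section.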

Pricing

Gemma 4 model pricing by provider (USD per 1M tokens)
Model               Provider               Input / 1M  Output / 1M  Type
Gemma 4 26B A4B IT  OpenRouter             $0.06       $0.33        Serverless
Gemma 4 26B A4B IT  Cloudflare Workers AI  $0.10       $0.30        Serverless
Gemma 4 31B IT      OpenRouter             $0.13       $0.38        Serverless
Gemma 4 26B A4B IT  Vercel AI Gateway      $0.13       $0.40        Serverless
Gemma 4 26B A4B IT  Novita AI              $0.13       $0.40        Serverless
Gemma 4 26B A4B IT  NextBit                $0.13       $0.40        Serverless
Gemma 4 31B         Vercel AI Gateway      $0.14       $0.40        Serverless
Gemma 4 31B IT      Novita AI              $0.14       $0.40        Serverless
Gemma 4 26B A4B IT  GCP Vertex AI          $0.15       $0.60        Serverless
Gemma 4 31B IT      GCP Vertex AI          $0.15       $0.60        Serverless
Gemma 4 31B IT      Together AI            $0.39       $0.97        Serverless
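
To make the per-1M-token prices concrete, here is a minimal sketch of the arithmetic for a single workload. It assumes billing is linear in token count, with no minimums, caching discounts, or tiering; the function name request_cost_usd and the token counts are illustrative assumptions, while the two prices come from the OpenRouter row for Gemma 4 26B A4B IT.

```python
def request_cost_usd(input_tokens, output_tokens, input_price_per_m, output_price_per_m):
    """Estimate cost in USD from token counts and prices quoted per 1M tokens."""
    return (input_tokens * input_price_per_m + output_tokens * output_price_per_m) / 1_000_000


# Hypothetical workload: 200k input tokens and 20k output tokens
# at $0.06 input and $0.33 output per 1M tokens.
cost = request_cost_usd(200_000, 20_000, 0.06, 0.33)
print(f"${cost:.4f}")
```

Here the input side contributes $0.0120 and the output side $0.0066, so output pricing matters even when the output is a tenth of the input.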


Frequently Asked Questions

What is Gemma 4 used for?
Gemma 4 is used for multimodal work, including vision, and for reasoning. The family description and the listed model capabilities point to those workloads as the best fit.
How does Gemma 4 compare to T5Gemma?
Gemma 4 by Google DeepMind is strongest where you need multimodal input, while T5Gemma, also by Google DeepMind, is the closest related family to check for agent workflows and tool use. Gemma 4 has 10 listed variants and reaches up to 256k context, so compare the specifications and pricing tables before choosing a production model.
Which Gemma 4 model should I use?
For the lowest listed input price, start with Gemma 4 26B A4B IT through OpenRouter at $0.06 per 1M input tokens. For the most capable and most recent choice to run locally, evaluate Gemma 4 12B, which offers 256k context along with reasoning, tool use, function calling, structured outputs, and multimodal inputs.
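
The lowest input price is not always the lowest total cost. The sketch below compares three of the listed Gemma 4 26B A4B IT providers for a given token mix, again assuming simple linear billing. LISTINGS and cheapest_provider are hypothetical names for this example, and the token counts are made up. OpenRouter is $0.04 cheaper per 1M input tokens and Cloudflare Workers AI is $0.03 cheaper per 1M output tokens, so under these assumptions Cloudflare becomes the cheaper of the two once output exceeds about 1.33 times the input.

```python
# (provider, input USD per 1M tokens, output USD per 1M tokens)
# for Gemma 4 26B A4B IT, taken from the pricing table.
LISTINGS = [
    ("OpenRouter", 0.06, 0.33),
    ("Cloudflare Workers AI", 0.10, 0.30),
    ("GCP Vertex AI", 0.15, 0.60),
]


def cheapest_provider(input_tokens, output_tokens):
    """Return (provider, estimated USD cost) with the lowest total for this token mix."""
    def cost(listing):
        _, in_price, out_price = listing
        return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

    best = min(LISTINGS, key=cost)
    return best[0], round(cost(best), 4)


print(cheapest_provider(1_000_000, 100_000))    # input-heavy mix
print(cheapest_provider(1_000_000, 2_000_000))  # output-heavy mix
```

For the input-heavy mix the result is OpenRouter; for the output-heavy mix it is Cloudflare Workers AI.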