llmreference

Gemma 4 Models by Google DeepMind

Google DeepMindApache 2.0
8 models2026Up to 256K ctxFrom $0.06/1M input

About

Google's most capable open-source model family, purpose-built for advanced reasoning and agentic workflows. Delivered in four sizes (E2B, E4B, 26B MoE, 31B dense) with multimodal capabilities including text, image, video, and audio processing for the smaller models.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

8 in view

Use when the workload needs 128K context, 2B parameters, and function calling.

2026-03128K context2B parametersfunction calling

Use when the workload needs 128K context, 2B parameters, and function calling.

2026-03128K context2B parametersfunction calling

Use when the workload needs 128K context, 4B parameters, and function calling.

2026-03128K context4B parametersfunction calling

Use when the workload needs 128K context, 4B parameters, and function calling.

2026-03128K context4B parametersfunction calling

Use when the workload needs 256K context, 26B parameters, and function calling.

2026-03256K context26B parametersfunction calling

Use when the workload needs 256K context, 26B parameters, and function calling.

2026-03256K context26B parametersfunction calling

Use when the workload needs 256K context, 31B parameters, and function calling.

2026-03256K context31B parametersfunction calling

Use when the workload needs 256K context, 31B parameters, and function calling.

2026-03256K context31B parametersfunction calling

Release Timeline

1 release group
2026-03
8 current
Gemma 4 26B A4B
256K context26B parametersfunction calling
Current
Gemma 4 26B A4B IT
256K context26B parametersfunction calling
Current
Gemma 4 31B
256K context31B parametersfunction calling
Current
Gemma 4 31B IT
256K context31B parametersfunction calling
Current
Gemma 4 E2B
128K context2B parametersfunction calling
Current
Gemma 4 E2B IT
128K context2B parametersfunction calling
Current
Gemma 4 E4B
128K context4B parametersfunction calling
Current
Gemma 4 E4B IT
128K context4B parametersfunction calling
Current

Specifications(8 models)

Gemma 4 model specifications comparison
ModelReleasedContextParametersMultimodalFn CallingStructured Outputs
Gemma 4 E2B2026-03128k2BYesYesNo
Gemma 4 E2B IT2026-03128k2BYesYesYes
Gemma 4 E4B2026-03128k4BYesYesNo
Gemma 4 E4B IT2026-03128k4BYesYesYes
Gemma 4 26B A4B2026-03256k26BYesYesNo
Gemma 4 26B A4B IT2026-03256k26BYesYesYes
Gemma 4 31B2026-03256k31BYesYesNo
Gemma 4 31B IT2026-03256k31BYesYesYes

Available From(4 providers)

Pricing

Gemma 4 model pricing by provider
ModelProviderInput / 1MOutput / 1MType
Gemma 4 26B A4B ITOpenRouter$0.06$0.33Serverless
Gemma 4 31B ITOpenRouter$0.13$0.38Serverless
Gemma 4 26B A4B ITGCP Vertex AI$0.15$0.6Serverless
Gemma 4 31B ITGCP Vertex AI$0.15$0.6Serverless
Gemma 4 31B ITTogether AI$0.2$0.5Serverless

Frequently Asked Questions

What is Gemma 4 used for?
Gemma 4 is used for vision and multimodal work, agent workflows and tool use, and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.
How does Gemma 4 compare to Gemini 3?
Gemma 4 by Google DeepMind is strongest where you need vision and multimodal work, while Gemini 3 by Google DeepMind is the closest related family to check for robotics. Gemma 4 has 8 listed variants and reaches up to 256K context, while Gemini 3 reaches up to 1M context, so compare the specs and pricing tables before choosing a production model.
Which Gemma 4 model should I use?
For the lowest listed input price, start with Gemma 4 26B A4B IT through OpenRouter at $0.06/1M input tokens. For the most capable/latest local choice, evaluate Gemma 4 26B A4B IT with 256K context and function calling, structured outputs, and multimodal inputs.

Models(8)