Gemma 4 Models by Google DeepMind
About
Google's most capable open-source model family, purpose-built for advanced reasoning and agentic workflows. Delivered in four sizes (E2B, E4B, 26B MoE, 31B dense) with multimodal capabilities including text, image, video, and audio processing for the smaller models.
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Gemma 4 E2B | Use when the workload needs 128K context, 2B parameters, and function calling. | 2026-03 | 128K context, 2B parameters, function calling | Current |
| Gemma 4 E2B IT | Use when the workload needs 128K context, 2B parameters, and function calling. | 2026-03 | 128K context, 2B parameters, function calling | Current |
| Gemma 4 E4B | Use when the workload needs 128K context, 4B parameters, and function calling. | 2026-03 | 128K context, 4B parameters, function calling | Current |
| Gemma 4 E4B IT | Use when the workload needs 128K context, 4B parameters, and function calling. | 2026-03 | 128K context, 4B parameters, function calling | Current |
| Gemma 4 26B A4B | Use when the workload needs 256K context, 26B parameters, and function calling. | 2026-03 | 256K context, 26B parameters, function calling | Current |
| Gemma 4 26B A4B IT | Use when the workload needs 256K context, 26B parameters, and function calling. | 2026-03 | 256K context, 26B parameters, function calling | Current |
| Gemma 4 31B | Use when the workload needs 256K context, 31B parameters, and function calling. | 2026-03 | 256K context, 31B parameters, function calling | Current |
| Gemma 4 31B IT | Use when the workload needs 256K context, 31B parameters, and function calling. | 2026-03 | 256K context, 31B parameters, function calling | Current |
Release Timeline
1 release group
Specifications (8 models)
| Model | Released | Context | Parameters | Multimodal | Fn Calling | Structured Outputs |
|---|---|---|---|---|---|---|
| Gemma 4 E2B | 2026-03 | 128k | 2B | Yes | Yes | No |
| Gemma 4 E2B IT | 2026-03 | 128k | 2B | Yes | Yes | Yes |
| Gemma 4 E4B | 2026-03 | 128k | 4B | Yes | Yes | No |
| Gemma 4 E4B IT | 2026-03 | 128k | 4B | Yes | Yes | Yes |
| Gemma 4 26B A4B | 2026-03 | 256k | 26B | Yes | Yes | No |
| Gemma 4 26B A4B IT | 2026-03 | 256k | 26B | Yes | Yes | Yes |
| Gemma 4 31B | 2026-03 | 256k | 31B | Yes | Yes | No |
| Gemma 4 31B IT | 2026-03 | 256k | 31B | Yes | Yes | Yes |
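The specifications table can be used as a simple filter when shortlisting a variant. The following is a minimal sketch, not part of any official tooling: the `Variant` class and `candidates` function are hypothetical names introduced here, and the rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    context_k: int            # context window, in thousands of tokens
    params_b: int             # parameter count in billions, as listed
    structured_outputs: bool  # "Structured Outputs" column


# Rows copied from the specifications table.
VARIANTS = [
    Variant("Gemma 4 E2B", 128, 2, False),
    Variant("Gemma 4 E2B IT", 128, 2, True),
    Variant("Gemma 4 E4B", 128, 4, False),
    Variant("Gemma 4 E4B IT", 128, 4, True),
    Variant("Gemma 4 26B A4B", 256, 26, False),
    Variant("Gemma 4 26B A4B IT", 256, 26, True),
    Variant("Gemma 4 31B", 256, 31, False),
    Variant("Gemma 4 31B IT", 256, 31, True),
]


def candidates(min_context_k=0, need_structured_outputs=False):
    """Return variants meeting the requirements, smallest listed size first."""
    matches = [
        v for v in VARIANTS
        if v.context_k >= min_context_k
        and (v.structured_outputs or not need_structured_outputs)
    ]
    return sorted(matches, key=lambda v: v.params_b)


# Example: a workload that needs 256K context and structured outputs.
for v in candidates(min_context_k=256, need_structured_outputs=True):
    print(v.name)
```

With the listed data, this example prints the two instruction-tuned 256K variants, Gemma 4 26B A4B IT followed by Gemma 4 31B IT.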
Available From (4 providers)
Pricing
| Model | Provider | Input / 1M | Output / 1M | Type |
|---|---|---|---|---|
| Gemma 4 26B A4B IT | OpenRouter | $0.06 | $0.33 | Serverless |
| Gemma 4 31B IT | OpenRouter | $0.13 | $0.38 | Serverless |
| Gemma 4 26B A4B IT | GCP Vertex AI | $0.15 | $0.60 | Serverless |
| Gemma 4 31B IT | GCP Vertex AI | $0.15 | $0.60 | Serverless |
| Gemma 4 31B IT | Together AI | $0.20 | $0.50 | Serverless |
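Because prices are quoted per 1M tokens and differ between input and output, the cheapest provider depends on the token mix of the workload. The sketch below estimates cost from the listed prices; `estimate_cost` and `cheapest` are hypothetical helper names, and the token counts in the example are illustrative assumptions rather than measured usage.

```python
# (model, provider) -> (input USD per 1M tokens, output USD per 1M tokens),
# copied from the pricing table.
PRICES = {
    ("Gemma 4 26B A4B IT", "OpenRouter"): (0.06, 0.33),
    ("Gemma 4 31B IT", "OpenRouter"): (0.13, 0.38),
    ("Gemma 4 26B A4B IT", "GCP Vertex AI"): (0.15, 0.60),
    ("Gemma 4 31B IT", "GCP Vertex AI"): (0.15, 0.60),
    ("Gemma 4 31B IT", "Together AI"): (0.20, 0.50),
}


def estimate_cost(model, provider, input_tokens, output_tokens):
    """Estimated cost in USD for the given token counts."""
    in_price, out_price = PRICES[(model, provider)]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def cheapest(model, input_tokens, output_tokens):
    """Return (provider, cost) with the lowest estimated cost for a model."""
    offers = {
        provider: estimate_cost(m, provider, input_tokens, output_tokens)
        for (m, provider) in PRICES
        if m == model
    }
    best = min(offers, key=offers.get)
    return best, offers[best]


# Example: 2M input tokens and 0.5M output tokens on Gemma 4 31B IT.
provider, cost = cheapest("Gemma 4 31B IT", 2_000_000, 500_000)
print(f"{provider}: ${cost:.2f}")
```

For this illustrative workload the estimates are about $0.45 on OpenRouter, $0.60 on GCP Vertex AI, and $0.65 on Together AI. An output-heavy workload shifts the comparison, since the output rate then dominates the total.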
Frequently Asked Questions
- What is Gemma 4 used for?
- Gemma 4 is used for vision and multimodal work, agent workflows and tool use, and structured outputs. The family description and the listed model capabilities point to those workloads as the best fit.
- How does Gemma 4 compare to Gemini 3?
- Gemma 4 is strongest for vision and multimodal work. Gemini 3, also from Google DeepMind, is the closest related family and the one to check for robotics. Gemma 4 lists 8 variants and reaches up to 256K context, while Gemini 3 reaches up to 1M context. Compare the specs and pricing tables before choosing a production model.
- Which Gemma 4 model should I use?
- For the lowest listed input price, start with Gemma 4 26B A4B IT through OpenRouter at $0.06 per 1M input tokens. For the most capable recent option to run locally, evaluate Gemma 4 26B A4B IT, which offers 256K context, function calling, structured outputs, and multimodal inputs.
Models (8)
Gemma 4 E2B
Gemma 4 E2B IT
Gemma 4 E4B
Gemma 4 E4B IT
Gemma 4 26B A4B
Gemma 4 26B A4B IT
Gemma 4 31B
Gemma 4 31B IT






