DiffusionGemma Models by Google DeepMind
1 model · 2026 · Up to 256k context
Details
Researcher: Google DeepMind
License: Apache 2.0 (OSI)
Commercial use: Allowed
Models: 1
Released: 2026
Max context: 256k
Capabilities
Vision: All models
Reasoning: All models
About
Google DeepMind's open-weight text-generation family that adapts the Gemma 4 26B A4B mixture-of-experts backbone for discrete diffusion. DiffusionGemma generates text by denoising 256-token canvases in parallel instead of decoding strictly left to right, while retaining multimodal text and image input support.
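As a rough illustration of what "denoising a canvas in parallel" can mean, the sketch below fills a 256-slot canvas over a handful of passes by committing the most confident proposals at each pass. It is a generic masked-diffusion toy, not DiffusionGemma's actual sampler: `toy_denoiser`, `denoise_canvas`, the `MASK` placeholder, the confidence-ranked schedule, and the step count are all assumptions made for illustration, and the "model" is just a random number generator.

```python
import math
import random

CANVAS_LEN = 256  # canvas size taken from the description above
MASK = None       # hypothetical placeholder for an undecided position


def toy_denoiser(canvas, rng):
    """Hypothetical stand-in for the model.

    Proposes a (token_id, confidence) pair for every masked position
    in a single pass. A real model would score all positions jointly.
    """
    return {
        i: (rng.randrange(1000), rng.random())
        for i, tok in enumerate(canvas)
        if tok is MASK
    }


def denoise_canvas(num_steps=8, seed=0):
    """Fill a fully masked canvas in `num_steps` parallel passes (assumed schedule)."""
    rng = random.Random(seed)
    canvas = [MASK] * CANVAS_LEN
    for step in range(num_steps):
        proposals = toy_denoiser(canvas, rng)
        if not proposals:
            break
        # Commit an equal share of the remaining slots on each pass;
        # on the final pass this share is everything still masked.
        remaining_steps = num_steps - step
        k = math.ceil(len(proposals) / remaining_steps)
        most_confident = sorted(
            proposals, key=lambda i: proposals[i][1], reverse=True
        )[:k]
        for i in most_confident:
            canvas[i] = proposals[i][0]
    return canvas


canvas = denoise_canvas()
print(sum(tok is not MASK for tok in canvas), "of", CANVAS_LEN, "positions filled")
```

With these assumed settings the canvas is completed in 8 passes over all positions, whereas strict left-to-right decoding would need one pass per token.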
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| DiffusionGemma 26B A4B IT | Use when the workload needs up to 256k context and reasoning in a 26B-parameter model. | 2026-06 | 256k context, 26B parameters, reasoning | Current |
Release Timeline
1 release group, 1 current
2026-06: DiffusionGemma 26B A4B IT (Current; 256k context, 26B parameters, reasoning)
Specifications (1 model)
| Model | Released | Context | Parameters | Vision | Reasoning |
|---|---|---|---|---|---|
| DiffusionGemma 26B A4B IT | 2026-06 | 256k | 26B | Yes | Yes |
Frequently Asked Questions
- What is DiffusionGemma used for?
- DiffusionGemma is used for vision and multimodal work, reasoning, and coding. The family description and listed model capabilities point to those workloads as the best fit.
- How does DiffusionGemma compare to Gemma 4?
- DiffusionGemma is strongest for vision and multimodal work; Gemma 4, also from Google DeepMind, is the closest related family to check for multimodal workloads. DiffusionGemma has 1 listed variant and reaches up to 256k context, the same maximum as Gemma 4, so compare the specs and pricing tables before choosing a production model.
- Which DiffusionGemma model should I use?
- If price is the main constraint, check the pricing table first, because the local data does not include complete provider pricing for DiffusionGemma. For the most capable and most recent option, evaluate DiffusionGemma 26B A4B IT, which offers 256k context, reasoning, and multimodal inputs.





