LLM Reference

DiffusionGemma Models by Google DeepMind

Google DeepMind · Apache 2.0 · Open source
1 model · 2026 · Up to 256k context

Details

Researcher: Google DeepMind
License: Apache 2.0 (OSI)
Commercial use: Allowed
Models: 1
Released: 2026
Max context: 256k

Capabilities

Vision: All models
Reasoning: All models

About

Google DeepMind's open-weight text-generation family that adapts the Gemma 4 26B A4B mixture-of-experts backbone for discrete diffusion. DiffusionGemma generates text by denoising 256-token canvases in parallel instead of decoding strictly left to right, while retaining multimodal text and image input support.
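As a rough illustration of the difference between parallel denoising and left-to-right decoding, the toy sketch below fills a fully masked canvas over a few steps, committing the most confident proposals at each step. It is not DiffusionGemma's actual algorithm: the names `MASK`, `toy_denoiser`, and `denoise_canvas` are hypothetical, the "denoiser" simply reads tokens from a fixed target with random confidences, and the canvas is far shorter than the 256 tokens described above.

```python
import random

MASK = "<mask>"  # hypothetical placeholder for a position not yet denoised


def toy_denoiser(canvas, target, rng):
    """Stand-in for a model: propose (token, confidence) for every masked slot.

    A real denoiser would predict tokens from context. This toy version reads
    them from a fixed target so the example stays runnable.
    """
    return {
        i: (target[i], rng.random())
        for i, tok in enumerate(canvas)
        if tok == MASK
    }


def denoise_canvas(target, steps=4, seed=0):
    """Fill a fully masked canvas in `steps` rounds of parallel proposals."""
    rng = random.Random(seed)
    canvas = [MASK] * len(target)
    history = []
    for step in range(steps):
        proposals = toy_denoiser(canvas, target, rng)
        if not proposals:
            break
        # Commit an even share of the remaining masked positions this step.
        remaining_steps = steps - step
        k = -(-len(proposals) // remaining_steps)  # ceiling division
        best = sorted(proposals, key=lambda i: proposals[i][1], reverse=True)[:k]
        for i in best:
            canvas[i] = proposals[i][0]
        history.append(list(canvas))
    return canvas, history


if __name__ == "__main__":
    final, history = denoise_canvas("the quick brown fox jumps".split(), steps=3)
    for snapshot in history:
        print(" ".join(snapshot))
```

Each printed snapshot shows positions being filled in no particular order, which is the point of contrast with strictly left-to-right decoding: every masked position receives a proposal in the same step, and only the order of commitment varies.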

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.


Use when the workload needs 256k context, 26B parameters, and reasoning.

2026-06 · 256k context · 26B parameters · reasoning

Release Timeline

1 release group
2026-06
1 current
DiffusionGemma 26B A4B IT
256k context · 26B parameters · reasoning
Current

Specifications (1 model)

DiffusionGemma model specifications comparison
Model | Released | Context | Parameters | Vision | Reasoning
DiffusionGemma 26B A4B IT | 2026-06 | 256k | 26B | Yes | Yes

Frequently Asked Questions

What is DiffusionGemma used for?
DiffusionGemma is used for vision and multimodal work, reasoning, and coding. The family description and listed model capabilities point to those workloads as the best fit.
How does DiffusionGemma compare to Gemma 4?
DiffusionGemma by Google DeepMind is strongest for vision and multimodal work; Gemma 4, also by Google DeepMind, is the closest related family to check for multimodal workloads. DiffusionGemma has 1 listed variant, and both families reach up to 256k context, so compare the specs and pricing tables before choosing a production model.
Which DiffusionGemma model should I use?
If price is the main constraint, check the pricing table first, but note that DiffusionGemma does not have complete provider pricing in the local data. For the most capable and most recent listed choice, evaluate DiffusionGemma 26B A4B IT, which offers 256k context, reasoning, and multimodal inputs.