LLM Reference

Imagen Models by Google DeepMind

Google DeepMind · Proprietary
9 models · 2024

About

Imagen is Google's family of image generation models, capable of producing high-quality, photorealistic images from text prompts. The family includes Imagen 3.0 and 4.0 variants that offer different speed/quality trade-offs.

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
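
As a rough illustration of how guidance of this kind could be derived, the sketch below builds a use-when sentence from a list of capability tags. The helper name `use_when` and the exact tag strings are assumptions made for illustration; the page does not document its actual derivation logic, and the context, release, and replacement fields are ignored here.

```python
def use_when(capabilities):
    """Build a use-when sentence from capability tags (hypothetical helper)."""
    if not capabilities:
        return "No use-when guidance available."
    if len(capabilities) == 1:
        needs = capabilities[0]
    else:
        # Join all but the last tag with commas, then append the last with "and".
        needs = ", ".join(capabilities[:-1]) + " and " + capabilities[-1]
    return f"Use when the workload needs {needs}."


print(use_when(["image generation"]))
print(use_when(["image generation", "multimodal inputs"]))
```

Keeping the sentence a pure function of the tags means two variants with the same capabilities always receive the same guidance, which matches the repeated wording in the variant list.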

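Imagen 4 Ultra (Current, 2024-11; image): use when the workload needs image generation.
Imagen 4 (Current, 2024-11; image): use when the workload needs image generation.
Imagen 4 Fast (Current, 2024-11; image): use when the workload needs image generation.
Imagen Product Recontext (Current, 2024-10; image, multimodal inputs): use when the workload needs image generation and multimodal inputs.
Virtual Try-On (Current, 2024-09; image, multimodal inputs): use when the workload needs image generation and multimodal inputs.
Imagen 3 (Current, 2024-08; image): use when the workload needs image generation.
Imagen 3 Fast (Current, 2024-08; image): use when the workload needs image generation.
Imagen 3 for Editing and Customization (Current, 2024-08; image, multimodal inputs): use when the workload needs image generation and multimodal inputs.
Imagen 3 (Current, 2024-08; image, multimodal inputs): use when the workload needs image generation and multimodal inputs.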

Release Timeline

4 release groups

2024-11 (3 current): Imagen 4 Ultra, Imagen 4, Imagen 4 Fast (all image)
2024-10 (1 current): Imagen Product Recontext (image, multimodal inputs)
2024-09 (1 current): Virtual Try-On (image, multimodal inputs)
2024-08 (4 current): Imagen 3 (image), Imagen 3 Fast (image), Imagen 3 for Editing and Customization (image, multimodal inputs), Imagen 3 (image, multimodal inputs)

Specifications (9 models)

Imagen model specifications comparison

| Model | Released | Vision | Multimodal |
|---|---|---|---|
| Imagen 4 Ultra | 2024-11 | No | No |
| Imagen 4 | 2024-11 | No | No |
| Imagen 4 Fast | 2024-11 | No | No |
| Imagen Product Recontext | 2024-10 | Yes | Yes |
| Virtual Try-On | 2024-09 | Yes | Yes |
| Imagen 3 | 2024-08 | No | No |
| Imagen 3 Fast | 2024-08 | No | No |
| Imagen 3 for Editing and Customization | 2024-08 | Yes | Yes |
| Imagen 3 | 2024-08 | Yes | Yes |

Available From (2 providers)

Frequently Asked Questions

What is Imagen used for?
Imagen is used for image generation, and some variants also cover vision and multimodal work. The family description and listed model capabilities point to those workloads as the best fit.
How does Imagen compare to Gemma 4?
Imagen is strongest where you need image generation, while Gemma 4, also by Google DeepMind, is the closest related family to check for vision and multimodal work. The listed figures are not directly comparable: this page lists 9 Imagen variants, while Gemma 4 is listed with up to 256K context. Compare the specs and pricing tables before choosing a production model.
Which Imagen model should I use?
If price is the main constraint, check the pricing table first, because the local data does not include complete provider pricing for Imagen. For the latest listed variant that accepts multimodal inputs, evaluate Imagen Product Recontext (2024-10). For text-to-image work only, the Imagen 4 variants (2024-11) are the most recent listed.
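
The selection rule in this answer can be sketched as a filter over the specifications table. The `SPECS` rows are transcribed from that table; the `latest` helper and its parameter names are hypothetical, and pricing is left out because the page has no complete pricing data.

```python
# (model, released, multimodal) rows transcribed from the specifications table.
SPECS = [
    ("Imagen 4 Ultra", "2024-11", False),
    ("Imagen 4", "2024-11", False),
    ("Imagen 4 Fast", "2024-11", False),
    ("Imagen Product Recontext", "2024-10", True),
    ("Virtual Try-On", "2024-09", True),
    ("Imagen 3", "2024-08", False),
    ("Imagen 3 Fast", "2024-08", False),
    ("Imagen 3 for Editing and Customization", "2024-08", True),
    ("Imagen 3", "2024-08", True),
]


def latest(specs, need_multimodal):
    """Return the names of the most recently released models that fit (hypothetical helper)."""
    # A model fits if it is multimodal, or if multimodal input is not required.
    fits = [row for row in specs if row[2] or not need_multimodal]
    if not fits:
        return []
    # YYYY-MM strings sort chronologically, so max() finds the newest month.
    newest = max(released for _, released, _ in fits)
    return [name for name, released, _ in fits if released == newest]


print(latest(SPECS, need_multimodal=True))
print(latest(SPECS, need_multimodal=False))
```

With `need_multimodal=True` the filter returns Imagen Product Recontext, in line with the recommendation above. Without that requirement it returns the three 2024-11 Imagen 4 variants, which then have to be separated on speed, quality, or price.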

Models (9)