Grok Image Models by xAI

xAI · Proprietary

6 models · 2024–2026 · Up to 131k ctx

Details

Researcher: xAI

License: Proprietary

Commercial use: conditional

Models: 6

Released: 2024–2026

Max context: 131k

Capabilities

Vision: 4 of 6 models

Multimodal: All models

About

xAI's Imagine creative-generation family covers image and video API models under the Grok brand. The current listing tracks still-image generation, quality-tier image generation, and video generation, including Grok Imagine Video 1.5 Preview, which produces short H.264 MP4 clips with synchronized audio.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

5 in view · 1 retired

Current Grok Image variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
Grok Imagine Video 1.5 Preview	Use when the workload needs video generation, multimodal inputs, and audio.	2026-06	video generation, multimodal inputs, audio	Current
Grok Imagine Image Quality	Use when the workload needs image generation and multimodal inputs.	2026-05	image generation, multimodal inputs	Current
Grok Imagine	Use when the workload needs image generation, 131k context, and multimodal inputs.	2025-08	image generation, 131k context, multimodal inputs	Current
Grok Imagine Video	Use when the workload needs multimodal inputs.	2025-01	multimodal inputs	Current
Grok Imagine Image	Use when the workload needs multimodal inputs.	2024-12	multimodal inputs	Current

Release Timeline

5 release groups

2026-06 · 1 current
Grok Imagine Video 1.5 Preview (Current): video generation, multimodal inputs, audio

2026-05 · 1 current
Grok Imagine Image Quality (Current): image generation, multimodal inputs

2025-08 · 1 current
Grok Imagine (Current): image generation, 131k context, multimodal inputs

2025-01 · 1 current
Grok Imagine Video (Current): multimodal inputs

2024-12 · 1 current · 1 retired
Grok Imagine Image (Current): multimodal inputs
Grok Imagine Image Pro (Replaced): multimodal inputs

Replaced By

Grok Imagine Image Pro (Replaced) → Grok Imagine Image Quality

Keep Grok Imagine Image Pro for legacy integrations; evaluate Grok Imagine Image Quality before new work.

Specifications (6 models)

Grok Image model specifications comparison
Model	Released	Context	Vision	Multimodal
Grok Imagine Video 1.5 Preview	2026-06	—	Yes	Yes
Grok Imagine Image Quality	2026-05	—	Yes	Yes
Grok Imagine	2025-08	131k	No	Yes
Grok Imagine Video	2025-01	—	No	Yes
Grok Imagine Image	2024-12	—	Yes	Yes
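
The table above can be turned into a small shortlist filter. The following is a minimal sketch, not an xAI or provider API: `ModelSpec`, `SPECS`, and `shortlist` are hypothetical names introduced for illustration, the rows are copied from the table, and "131k" is assumed to mean 131,000 tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    released: str            # YYYY-MM, as listed in the table
    context: Optional[int]   # tokens; None where the table lists no value
    vision: bool
    multimodal: bool


# Rows copied from the specifications table; 131k is assumed to be 131_000 tokens.
SPECS = [
    ModelSpec("Grok Imagine Video 1.5 Preview", "2026-06", None, True, True),
    ModelSpec("Grok Imagine Image Quality", "2026-05", None, True, True),
    ModelSpec("Grok Imagine", "2025-08", 131_000, False, True),
    ModelSpec("Grok Imagine Video", "2025-01", None, False, True),
    ModelSpec("Grok Imagine Image", "2024-12", None, True, True),
]


def shortlist(specs, need_vision=False, min_context=0):
    """Return the models that meet the requirements, newest release first."""
    matches = [
        s for s in specs
        if (s.vision or not need_vision)
        # A model with no listed context fails any positive context requirement.
        and (min_context == 0
             or (s.context is not None and s.context >= min_context))
    ]
    # YYYY-MM strings sort chronologically, so a plain string sort is enough.
    return sorted(matches, key=lambda s: s.released, reverse=True)


if __name__ == "__main__":
    for spec in shortlist(SPECS, need_vision=True):
        print(spec.released, spec.name)
```

Treating a missing context value as "does not qualify" is a conservative choice for this sketch; the table does not say whether those models have no context window or simply no recorded value.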

Available From (2 providers)

Vercel AI Gateway

xAI Console

Frequently Asked Questions

What is Grok Image used for?
Grok Image is used for video generation, image generation, and vision and multimodal work. The family description and the listed model capabilities point to these workloads as the best fit.

How does Grok Image compare to Grok 1?
Grok Image by xAI is strongest for video generation, while Grok 1, also by xAI, is the closest related family to check for reasoning workloads. Grok Image lists 6 variants and reaches up to 131k context, so compare the specs and pricing tables before choosing a production model.

Which Grok Image model should I use?
If price is the main constraint, check the pricing table first, and note that the local data does not have complete provider pricing for Grok Image. For the most capable and most recent option in the local data, evaluate Grok Imagine Video 1.5 Preview, which accepts multimodal inputs.

Models (6)

Grok Imagine Video 1.5 Preview

2026-06 · 1 provider · Multimodal

Grok Imagine Image Quality

Grok Imagine

Grok Imagine Video

Grok Imagine Image
