Grok Image Models by xAI
Details
Researcher: xAI
License: Proprietary
Commercial use: conditional
Models: 6
Released: 2024–2026
Max context: 131k
Capabilities
Vision: 4 of 6 models
Multimodal: All models
xAI's Imagine creative-generation family covers image and video API models under the Grok brand. The current data tracks three areas: still-image generation, quality-tier image generation, and video generation. The video line includes Grok Imagine Video 1.5 Preview, which produces short H.264 MP4 clips with synchronized audio.
Current Variants
Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.
5 in view · 1 retired
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| Grok Imagine Video 1.5 Preview | Use when the workload needs video generation, multimodal inputs, and audio. | 2026-06 | video generation, multimodal inputs, audio | Current |
| Grok Imagine Image Quality | Use when the workload needs image generation and multimodal inputs. | 2026-05 | image generation, multimodal inputs | Current |
| Grok Imagine | Use when the workload needs image generation, 131k context, and multimodal inputs. | 2025-08 | image generation, 131k context, multimodal inputs | Current |
| Grok Imagine Video | Use when the workload needs multimodal inputs. | 2025-01 | multimodal inputs | Current |
| Grok Imagine Image | Use when the workload needs multimodal inputs. | 2024-12 | multimodal inputs | Current |
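The use-when guidance amounts to matching required capabilities against each model's signals. A minimal sketch of that selection, assuming the table rows are transcribed by hand into a list: the `Variant` class and `pick` helper are hypothetical names for illustration, not part of any xAI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    released: str        # "YYYY-MM", as listed in the Released column
    signals: frozenset   # capability tags from the Signals column
    status: str


# Rows transcribed from the Current Variants table (illustrative data structure).
VARIANTS = [
    Variant("Grok Imagine Video 1.5 Preview", "2026-06",
            frozenset({"video generation", "multimodal inputs", "audio"}), "Current"),
    Variant("Grok Imagine Image Quality", "2026-05",
            frozenset({"image generation", "multimodal inputs"}), "Current"),
    Variant("Grok Imagine", "2025-08",
            frozenset({"image generation", "131k context", "multimodal inputs"}), "Current"),
    Variant("Grok Imagine Video", "2025-01",
            frozenset({"multimodal inputs"}), "Current"),
    Variant("Grok Imagine Image", "2024-12",
            frozenset({"multimodal inputs"}), "Current"),
]


def pick(required):
    """Return current variants that carry every required signal, newest first."""
    needed = set(required)
    matches = [v for v in VARIANTS if v.status == "Current" and needed <= v.signals]
    # "YYYY-MM" strings sort chronologically, so a plain string sort is enough.
    return sorted(matches, key=lambda v: v.released, reverse=True)


if __name__ == "__main__":
    for variant in pick({"image generation"}):
        print(variant.released, variant.name)
```

Sorting newest first mirrors the table's own ordering; a workload with stricter needs would pass more signals and get a shorter list, possibly an empty one.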
Release Timeline
5 release groups
| Released | Model | Status | Signals |
|---|---|---|---|
| 2026-06 | Grok Imagine Video 1.5 Preview | Current | video generation, multimodal inputs, audio |
| 2026-05 | Grok Imagine Image Quality | Current | image generation, multimodal inputs |
| 2025-08 | Grok Imagine | Current | image generation, 131k context, multimodal inputs |
| 2025-01 | Grok Imagine Video | Current | multimodal inputs |
| 2024-12 | Grok Imagine Image | Current | multimodal inputs |
| 2024-12 | Grok Imagine Image Pro | Replaced | multimodal inputs |
Replaced By: keep Grok Imagine Image Pro for legacy integrations; evaluate Grok Imagine Image Quality before new work.
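The replacement note can be applied in code as a small lookup that leaves legacy integrations untouched and redirects new work. This is a sketch under one assumption: that the note's suggestion to evaluate Grok Imagine Image Quality applies to the retired Grok Imagine Image Pro entry. `REPLACEMENTS` and `resolve_model` are hypothetical names.

```python
# Hypothetical mapping: retired model name -> model to evaluate for new work.
# The single entry is read from the replacement note, not from an official API.
REPLACEMENTS = {
    "Grok Imagine Image Pro": "Grok Imagine Image Quality",
}


def resolve_model(name: str, new_work: bool = True) -> str:
    """Return the model name to use.

    Legacy integrations (new_work=False) keep the name they already use;
    new work is redirected when the requested model has a listed replacement.
    """
    if new_work and name in REPLACEMENTS:
        return REPLACEMENTS[name]
    return name


if __name__ == "__main__":
    print(resolve_model("Grok Imagine Image Pro"))
    print(resolve_model("Grok Imagine Image Pro", new_work=False))
```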
Specifications (6 models)
| Model | Released | Context | Vision | Multimodal |
|---|---|---|---|---|
| Grok Imagine Video 1.5 Preview | 2026-06 | — | Yes | Yes |
| Grok Imagine Image Quality | 2026-05 | — | Yes | Yes |
| Grok Imagine | 2025-08 | 131k | No | Yes |
| Grok Imagine Video | 2025-01 | — | No | Yes |
| Grok Imagine Image | 2024-12 | — | Yes | Yes |
Available From (2 providers)
Frequently Asked Questions
- What is Grok Image used for?
  Grok Image is used for video generation, image generation, and vision and multimodal work. The family description and the listed model capabilities point to those workloads as the best fit.
- How does Grok Image compare to Grok 1?
  Grok Image by xAI is strongest where you need video generation, while Grok 1 by xAI is the closest related family to check for reasoning. Grok Image has 6 listed variants and reaches up to 131k context, so compare the specs and pricing tables before choosing a production model.
- Which Grok Image model should I use?
  If price is the main constraint, start with the pricing table, but note that provider pricing for Grok Image is incomplete in the local data. For the most capable or latest listed choice, evaluate Grok Imagine Video 1.5 Preview, which accepts multimodal inputs.





