GPT Image 2
Multimodal
About
OpenAI's image generation model and the successor to DALL-E 3, released April 21, 2026. Uses an autoregressive architecture with native reasoning: the model plans structure and composition before generating pixels. Supports resolutions up to 4K (4096×4096), achieves ~99% character-level text accuracy across Latin, CJK, Hindi, and Bengali scripts, and generates images ~2× faster than DALL-E 3. Debuted at #1 on the Image Arena leaderboard with a 242-point lead. Powered by the GPT-5.4 backbone. API pricing: $8/M image input tokens, $2/M cached, $30/M image output tokens. Availability: ChatGPT GA April 22, 2026; API access May 2026.
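As a rough illustration of how the per-million-token rates quoted above combine, the following sketch estimates the cost of a single request. The function name, the token counts, and the assumption that cached tokens are billed at the cached rate instead of the regular image input rate are all hypothetical and not taken from OpenAI's documentation; only the three rates come from the description.

```python
from decimal import Decimal

# USD rates per one million tokens, as quoted in the description above.
RATE_IMAGE_INPUT = Decimal("8")
RATE_CACHED = Decimal("2")
RATE_IMAGE_OUTPUT = Decimal("30")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_cost(uncached_input_tokens: int,
                  cached_tokens: int,
                  output_tokens: int) -> Decimal:
    """Estimate request cost in USD.

    Assumption (not confirmed by the source): cached tokens are billed
    at the cached rate only, and are not also counted as regular input.
    """
    total = (
        Decimal(uncached_input_tokens) * RATE_IMAGE_INPUT
        + Decimal(cached_tokens) * RATE_CACHED
        + Decimal(output_tokens) * RATE_IMAGE_OUTPUT
    )
    return total / TOKENS_PER_MILLION


# Hypothetical token counts, chosen only to show the arithmetic:
# 10,000 * $8/M + 4,000 * $2/M + 50,000 * $30/M
cost = estimate_cost(10_000, 4_000, 50_000)
print(f"${cost}")
```

Decimal is used instead of float so that small per-token amounts add up without rounding noise. Actual token counts per image depend on resolution and settings that this card does not specify.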
Capabilities
Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution