LLM ReferenceLLM Reference

GPT Image 2

Multimodal

About

OpenAI's image generation model succeeding DALL-E 3, released April 21, 2026. Uses an autoregressive architecture with native reasoning — the model plans structure and composition before generating pixels. Supports up to 4K (4096×4096) resolution, achieves ~99% character-level text accuracy across Latin, CJK, Hindi, and Bengali scripts, and generates images ~2× faster than DALL-E 3. Debuted at #1 on the Image Arena leaderboard by a +242-point margin. Powered by the GPT-5.4 backbone. API: $8/M image input tokens, $2/M cached, $30/M image output tokens. ChatGPT GA April 22, 2026; API access May 2026.

Capabilities

VisionMultimodalReasoningFunction CallingTool UseStructured OutputsCode Execution

Rankings

Specifications

FamilyGPT Image
Released2026-04-21
Architectureautoregressive
Specializationimage-generation
LicenseProprietary
Trainingpretrained

Created by

Cutting-edge research and development.

San Francisco, California, United States
Founded 2015
Website