Best AI video models

Creative video-generation models live in a separate seed catalog from text LLMs. This page is distinct from /best/vision, which ranks multimodal models that understand visual inputs rather than generate clips.

Last refreshed 2026-06-03. Refresh cadence: weekly.

Compare video-generation models across the Sora, Veo, Runway, Kling, and Luma families and related creative providers.

How we rank

Video-generation models are tracked in creative seed files separate from text LLMs. This page lists active video models tagged for creative generation until video benchmark rows are standardized.

Eligibility — Active rows from video-models.json. Distinct from /best/vision, which ranks multimodal LLMs that understand image or video inputs.
Ordering — Alphabetical by model name until video-generation benchmark and pricing columns are normalized across creative providers.
Cross-links — Models that also exist in model.json link to /model/[slug] for API pricing; others remain creative-catalog entries only.
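
The three rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names `name`, `lab`, `slug`, and `status`, the example slugs, and the "Retired Example" row are hypothetical, since the actual schema of video-models.json and model.json is not shown on this page.

```python
from typing import Optional


def build_listing(video_rows: list[dict], model_slugs: set[str]) -> list[dict]:
    """Apply the eligibility, ordering, and cross-link rules to catalog rows.

    video_rows:  rows as they might appear in video-models.json (assumed shape).
    model_slugs: slugs assumed to be present in model.json.
    """
    # Eligibility: keep active rows only.
    active = [row for row in video_rows if row.get("status") == "active"]

    # Ordering: alphabetical by model name, ignoring case.
    active.sort(key=lambda row: row["name"].casefold())

    listing = []
    for row in active:
        slug = row.get("slug")
        # Cross-links: only models that also exist in model.json get a /model page.
        model_page: Optional[str] = f"/model/{slug}" if slug in model_slugs else None
        listing.append({"name": row["name"], "lab": row["lab"], "model_page": model_page})
    return listing


# Hypothetical rows; the slugs and status values are illustrative only.
example_rows = [
    {"name": "Veo 3.1", "lab": "Google DeepMind", "slug": "veo-3-1", "status": "active"},
    {"name": "Genmo Mochi 1", "lab": "Genmo AI", "slug": "genmo-mochi-1", "status": "active"},
    {"name": "Retired Example", "lab": "Example Lab", "slug": "retired", "status": "inactive"},
]

for entry in build_listing(example_rows, model_slugs={"veo-3-1"}):
    print(entry["name"], entry["model_page"])
```

In this sketch a row without a match in `model_slugs` simply carries `model_page=None`, which mirrors the "creative-catalog entries only" case.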

Tracked video models (28)

Editorial ordering is alphabetical until video-generation benchmark and pricing rows are standardized across creative providers. Models with a linked /model page include API pricing and provider routes where available.

Model	Lab	Notes
Genmo Mochi 1	Genmo AI	Genmo Mochi 1 via FAL. Open-source text-to-video with strong character consistency. Fixed 5s/480p/30fps output — no resolution or duration controls.
Grok Imagine Video	xAI	xAI Grok Imagine Video via FAL. Text-to-video and image-to-video at up to 720p, durations 1-15 seconds. $0.05/sec. Wider aspect ratio support including 3:2, 2:3, and auto. No native audio.
Grok Imagine Video 1.5 Preview	xAI	xAI Grok Imagine Video 1.5 Preview via the xAI API. Image-to-video with synchronized audio, 6-15 second H.264 MP4 output at 24 FPS, 480p or 720p, and common aspect ratios including 1:1, 16:9, and 9:16. The model-specific xAI docs state it currently does not support text-to-video. Priced at $0.08/sec for 480p or $0.14/sec for 720p, plus $0.01 per image input.
Hunyuan Video	Tencent AI Lab	Tencent HunyuanVideo via FAL. Open-weight text-to-video and image-to-video at 480p/580p/720p, 3s or 5s durations at 24fps. $0.40/video flat pricing. Cinematic motion, strong scene composition.
Kling 2.6 Pro	Kuaishou Technology	Kuaishou Kling 2.6 Pro via FAL. Audio-first video generation with dialogue and SFX in English and Chinese. Durations 5s or 10s at 720p.
Kling O3 Pro	Kuaishou Technology	Kuaishou Kling O3 Pro via FAL. Advanced AI video generation with native audio, durations up to 15 seconds, 720p. Also available via Runware.
Kling O3 Standard	Kuaishou Technology	Kuaishou Kling O3 Standard via FAL. Cost-efficient video generation with native audio, durations up to 15 seconds, 720p. Also available via Runware.
Kling V3 Pro	Kuaishou Technology	Kuaishou Kling V3 Pro via FAL. Native audio, durations up to 15 seconds, 720p, negative prompt support.
Kling V3 Standard	Kuaishou Technology	Kuaishou Kling V3 Standard via FAL. Affordable native-audio video generation, durations up to 15 seconds, 720p, negative prompt support.
LTX Video	Lightricks	Lightricks LTX-Video via FAL. Real-time capable text-to-video and image-to-video at 720p, up to 10 seconds. Open source.
LTX Video Distilled	Lightricks	Lightricks LTX-Video 13B Distilled via FAL. Fastest and cheapest ($0.025/sec) text-to-video and image-to-video at 720p, up to 10 seconds. Open source.
LTX-2.3	Lightricks	Lightricks LTX-2.3 via FAL. Open-source video model with native audio, 4K (2160p) resolution, 25fps, and durations 6-10 seconds. Text-to-video and image-to-video. Also on Runware.
LTX-2.3 Fast	Lightricks	Lightricks LTX-2.3 Fast via FAL. Speed-optimized open-source variant of LTX-2.3 with native audio and 4K at lower cost. Also on Runware.
Luma Ray 2	Luma AI	Luma AI Ray 2 via FAL. Cinematic text-to-video and image-to-video at up to 1080p, 9 seconds. Premium pricing ($0.20/sec). Strong motion coherence.
Pika 2.2	Pika	Pika Labs Pika 2.2 via FAL. Text-to-video and image-to-video at up to 1080p, 10 seconds. Seed and negative prompt supported.
Pika v2 Turbo	Pika	Pika Labs Pika v2 Turbo via FAL. Fast affordable text-to-video and image-to-video at 720p, 5 seconds fixed duration.
PixVerse C1	PixVerse	PixVerse C1 via FAL. Film-grade video generation with native audio including dialogue, ambient sound, and music. Up to 1080p, 15 seconds.
PixVerse V6	PixVerse	PixVerse V6 via FAL. AI video generation with native audio, 360p to 1080p resolutions, durations 1-15 seconds.
Runway Gen-4 Turbo	Runway	Runway Gen-4 Turbo via Runway API. Speed-optimized image-to-video only. Extended durations up to 38 seconds at 720p.
Runway Gen-4.5	Runway	Runway Gen-4.5 via Runway API. Latest evolution of the Gen-4 family with improved motion consistency and extended 16-second duration at 1080p. Supports text-to-video and image-to-video.
Seedance 2.0 Fast	ByteDance	ByteDance Seedance 2.0 Fast via FAL. Speed-optimized variant of Seedance 2.0 for rapid 720p video generation up to 15 seconds with native audio.
Seedance 2.0 Quality	ByteDance	ByteDance Seedance 2.0 Quality via FAL. Advanced multimodal video generation at up to 720p and 15 seconds, with native audio. Text-to-video and image-to-video.
Sora 2	OpenAI	OpenAI Sora 2 via FAL. Text-to-video and image-to-video at 720p, durations up to 20 seconds ($0.08/sec). OpenAI's direct Sora API was deprecated in March 2026; the model remains accessible via the FAL aggregator. Pair with Sora 2 Pro (1080p tier).
Sora 2 Pro	OpenAI	OpenAI Sora 2 Pro via FAL. Premium 1080p text-to-video and image-to-video, durations up to 20 seconds. OpenAI's direct API was deprecated in March 2026; the model remains accessible via the FAL aggregator.
Veo 3.1	Google DeepMind	Google DeepMind Veo 3.1 via Vertex AI. State-of-the-art video generation: up to 4K resolution, 30 seconds, native audio-sync. Only model in this table that pairs 4K output with a 30-second duration.
Vidu Q3	Shengshu	Shengshu AI Vidu Q3 via FAL. High-quality video generation with native audio, up to 1080p and 16 seconds. Also available via Runware.
Vidu Q3 Turbo	Shengshu	Shengshu AI Vidu Q3 Turbo via FAL. Fast video generation with native audio, up to 1080p and 16 seconds at half the cost of Q3 standard. Also available via Runware.
Wan 2.7	Alibaba	Alibaba Wan 2.7 via FAL. Text-to-video and image-to-video with synchronized native audio at up to 1080p and 15 seconds. Seed and negative prompt supported.
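
Several rows above quote per-second prices, so a rough per-clip cost is simple arithmetic. The sketch below copies the listed rates from the Notes column and is not a provider API; the function names are hypothetical, and it does not enforce each model's duration or resolution limits.

```python
from decimal import Decimal

# Per-second prices (USD) as listed in the Notes column of the table above.
PER_SECOND_USD = {
    "Grok Imagine Video": Decimal("0.05"),
    "LTX Video Distilled": Decimal("0.025"),
    "Luma Ray 2": Decimal("0.20"),
    "Sora 2": Decimal("0.08"),
}

# Grok Imagine Video 1.5 Preview is priced per resolution, plus a per-image fee.
GROK_15_PER_SECOND_USD = {"480p": Decimal("0.08"), "720p": Decimal("0.14")}
GROK_15_PER_IMAGE_USD = Decimal("0.01")


def clip_cost(model: str, seconds: int) -> Decimal:
    """Estimated cost of one clip for a model with a flat per-second price."""
    return PER_SECOND_USD[model] * seconds


def grok_15_preview_cost(seconds: int, resolution: str, image_inputs: int = 1) -> Decimal:
    """Estimated cost of one Grok Imagine Video 1.5 Preview clip (image-to-video)."""
    return GROK_15_PER_SECOND_USD[resolution] * seconds + GROK_15_PER_IMAGE_USD * image_inputs


print(clip_cost("Sora 2", 20))           # 20 s at $0.08/sec
print(clip_cost("Luma Ray 2", 9))        # 9 s at $0.20/sec
print(grok_15_preview_cost(10, "720p"))  # 10 s at $0.14/sec plus one image input
```

With the listed rates, a 20-second Sora 2 clip comes to $1.60, a 9-second Luma Ray 2 clip to $1.80, and a 10-second 720p Grok Imagine Video 1.5 Preview clip from one image to $1.41. Hunyuan Video is priced at a flat $0.40 per video, so it does not fit the per-second formula. `Decimal` is used here to avoid binary floating-point rounding in currency values.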

Related pages

Editorial video picks - qualitative tiers and Editor's Choice for video generation.
Best multimodal / vision LLMs - image understanding, document QA, and video analysis.
fal.ai, Runware, OpenAI API, Google AI Studio