Cosmos 3 Super Image2Video
Last refreshed 2026-06-01. Next refresh: weekly.
Cosmos 3 Super Image2Video has model metadata, but missing tracked provider pricing keeps it from being a default production pick.
Decision context: Vision task fit, 0 tracked provider routes, and research from 2026-06-01.
Use it for
- Teams evaluating vision models for image-to-video generation
- Workloads whose text prompts fit within a 4k-token context window
Do not use it for
- Cost-sensitive launches that need sourced token pricing
- Strict JSON or tool-calling flows
- Teams that need a tracked hosted API route today
Cheapest output
-
No tracked output price
Provider routes
0
No provider route in seed
Quality / dollar
Unknown
No task benchmark coverage yet
Freshness
2026-06-01
Researched today
Top use-case fit
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
No tracked provider token pricing is available for this model yet.
Benchmark peer bars for Vision
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
About
Cosmos 3 Super Image2Video is a 64B-parameter fine-tuned variant of Cosmos 3 Super specialized for temporally coherent image-to-video generation. It takes a single image (JPG, PNG, or WebP at 256p to 720p) plus an optional text prompt (up to 4096 tokens) and outputs MP4 video of 5 to 400 frames (default 189) at up to 720p, with optional muxed AAC stereo audio at 48 kHz. It is ranked #1 among open models on the Artificial Analysis image-to-video leaderboard. It is available via Hugging Face Diffusers and vLLM-Omni.
Cosmos 3 Super Image2Video has a 4k-token context window.
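The input limits described above can be checked before a job is submitted. The following is a minimal sketch, not an official client: the `Image2VideoRequest` class and its field names are hypothetical, "256p to 720p" is assumed to refer to pixel height, and `.jpeg` is assumed to be accepted alongside `.jpg`. It does not call Diffusers or vLLM-Omni.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import List

# Limits copied from the About text above.
ALLOWED_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}  # ".jpeg" is an assumed alias
MIN_HEIGHT, MAX_HEIGHT = 256, 720   # assumes "256p to 720p" means pixel height
MIN_FRAMES, MAX_FRAMES = 5, 400
DEFAULT_FRAMES = 189
MAX_PROMPT_TOKENS = 4096


@dataclass
class Image2VideoRequest:
    """Hypothetical request record; not part of any real SDK."""
    image_path: str
    image_height: int
    prompt_tokens: int = 0          # the text prompt is optional
    num_frames: int = DEFAULT_FRAMES

    def problems(self) -> List[str]:
        """Return human-readable limit violations (empty list if none)."""
        found = []
        suffix = Path(self.image_path).suffix.lower()
        if suffix not in ALLOWED_SUFFIXES:
            found.append(f"unsupported image type: {suffix or '(none)'}")
        if not MIN_HEIGHT <= self.image_height <= MAX_HEIGHT:
            found.append(f"image height {self.image_height} outside {MIN_HEIGHT}-{MAX_HEIGHT}")
        if not 0 <= self.prompt_tokens <= MAX_PROMPT_TOKENS:
            found.append(f"prompt has {self.prompt_tokens} tokens, limit is {MAX_PROMPT_TOKENS}")
        if not MIN_FRAMES <= self.num_frames <= MAX_FRAMES:
            found.append(f"num_frames {self.num_frames} outside {MIN_FRAMES}-{MAX_FRAMES}")
        return found


if __name__ == "__main__":
    ok = Image2VideoRequest("frame.png", image_height=480)
    bad = Image2VideoRequest("frame.gif", image_height=1080, prompt_tokens=5000, num_frames=4)
    print(ok.problems())
    print(bad.problems())
```

Returning a list of violations rather than raising on the first one lets a caller report every problem in a single pass, which may be useful when batch-checking inputs before an evaluation run.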
Capabilities
API Versions
cosmos-3-super-image2video