Phi 4 Multimodal Instruct
phi-4-multimodal-instruct
Last refreshed 2026-05-16. Next refresh: weekly.
Phi 4 Multimodal Instruct is worth evaluating for long context and vision when its provider route and context window match the workload.
Decision context: long-context task fit, 3 tracked provider routes, and research dated 2026-01-01.
Use it for
- Teams evaluating long context and vision
- Workloads that can use a 128K context window
- Buyers comparing 3 tracked provider routes
Do not use it for
- Strict JSON or tool-calling flows
Cheapest output: $0.900 per 1M tokens (Fireworks AI)
Provider routes: 3 tracked API hosts
Quality / dollar: Unknown (no task benchmark coverage yet)
Freshness: 2026-01-01 (researched 137d ago)
Top use-case fit
Long context
Included by capability and metadata signals in the decision map.
Vision
Included by capability and metadata signals in the decision map.
Provider price ladder
| Provider | Input / 1M | Output / 1M | Route |
|---|---|---|---|
| Fireworks AI | $0.900 | $0.900 | Serverless |
| Microsoft Foundry | - | - | Serverless (Partial) |
| NVIDIA NIM | - | - | Serverless (Partial) |
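
The "cheapest output" figure can be derived from the ladder by ignoring routes with no listed price. The following is a minimal sketch, assuming a `-` in the table means the price is not tracked (represented here as `None`); the `ROUTES` structure and the `cheapest_output` helper are hypothetical names used for illustration, not part of any provider API.

```python
from typing import Optional

# Rows copied from the price ladder above (USD per 1M tokens).
# Assumption: "-" in the table means "no tracked price", stored as None.
ROUTES = [
    {"provider": "Fireworks AI", "input": 0.900, "output": 0.900},
    {"provider": "Microsoft Foundry", "input": None, "output": None},
    {"provider": "NVIDIA NIM", "input": None, "output": None},
]


def cheapest_output(routes: list) -> Optional[dict]:
    """Return the route with the lowest listed output price, or None."""
    # Skip routes without a listed price instead of treating them as free.
    priced = [r for r in routes if r["output"] is not None]
    if not priced:
        return None
    return min(priced, key=lambda r: r["output"])


best = cheapest_output(ROUTES)
print(best["provider"] if best else "no priced route")
```

Filtering before taking the minimum matters here: two of the three tracked routes have no price, so a naive comparison would either fail or rank an unpriced route as cheapest.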
Benchmark peer bars for Long context
No task-mapped benchmark peers are available for this model yet.
Migration checks
No linked migration route is available for this model yet.
About
Phi 4 Multimodal Instruct has a 128K-token context window.
Input tokens for Phi 4 Multimodal Instruct cost $0.9/1M and output tokens cost $0.9/1M on Fireworks AI, the only tracked route with listed pricing.
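
To turn these figures into a per-request estimate, the sketch below multiplies token counts by the listed rates and checks the request against the context window. It rests on two assumptions that are not stated on this page: that "128K" means 128,000 tokens (some providers mean 131,072), and that input and output tokens share the same window. The function names are hypothetical.

```python
# Figures from this page (USD per 1M tokens).
INPUT_USD_PER_M = 0.9
OUTPUT_USD_PER_M = 0.9

# Assumption: "128K" is read as 128,000 tokens; a provider may mean 131,072.
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed per-1M rates."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, output_tokens: int) -> bool:
    """Check the request against the window.

    Assumption: input and output tokens count against the same window.
    """
    return input_tokens + output_tokens <= CONTEXT_WINDOW_TOKENS


# Example: a 100,000-token document with a 2,000-token answer.
print(fits_context(100_000, 2_000))
print(round(estimate_cost_usd(100_000, 2_000), 4))
```

At these rates a request that nearly fills the window costs roughly a tenth of a dollar, so for long-context workloads the request count is likely to dominate the bill.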