Phi-4 Multimodal

Name: Phi-4 Multimodal
Author: Microsoft Research

Released: 2025-01-01
Last refreshed: 2026-06-04
Status: Researched 26d ago

Open source, Commercial use: permitted, Multimodal, Long context, Vision

Phi-4 Multimodal is a released open-source model for long-context and vision workloads, with a 128k-token context window; evaluate it while provider pricing coverage matures.

Use it for

Teams evaluating long-context and vision models
Workloads that can use a 128k context window

Do not use it for

Cost-sensitive launches that need sourced token pricing
Strict JSON or tool-calling flows
Teams that need a tracked hosted API route today

Specifications

Family: Phi-4
Released: 2025-01-01
Context: 128k
Parameters: 5.6B
Knowledge cutoff: 2024-06
Specialization: general
Openness: Open source
License: MIT (OSI-approved)
Commercial use: permitted

Created by

Microsoft Research

Advancing the state-of-the-art in AI and computing.

Redmond, Washington, United States

Founded 1991

Website

Pricing

No tracked provider token pricing is available yet.

Links

Website

About

Microsoft Phi-4 Multimodal is the multimodal variant of Phi-4, capable of processing images and text. It is tracked separately from phi-4-multimodal-instruct, the instruction-tuned version. Engineer note: check whether this entry is the same as phi-4-multimodal-instruct in the seed; Azure Foundry may list base and instruct as separate SKUs.

Phi-4 Multimodal is an open-source model in the Phi-4 family. The structured metadata tracks a 128k-token context window and multimodal input. No headline benchmark score is tracked for Phi-4 Multimodal yet.
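To make the 128k-token context window concrete, here is a minimal sketch of a pre-flight check that estimates whether a text prompt is likely to fit. It rests on several assumptions that do not come from this listing: "128k" is read as 128,000 tokens, token counts are approximated at about 4 characters per token, and the names estimate_tokens and fits_in_context are hypothetical helpers. Image inputs also consume context and are not counted here, so treat the result as a rough screen, not as a tokenizer-accurate answer.

```python
import math

# Assumption: "128k" in the listing is read as 128,000 tokens.
CONTEXT_WINDOW_TOKENS = 128_000

# Assumption: rough heuristic of ~4 characters per token for English text.
# A real check should use the model's own tokenizer.
CHARS_PER_TOKEN = 4.0


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Estimate the token count of a text from its character length."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    text: str,
    reserved_output_tokens: int = 1024,
    context_window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the estimated prompt plus reserved output fits the window."""
    return estimate_tokens(text) + reserved_output_tokens <= context_window


if __name__ == "__main__":
    prompt = "Summarize the attached report. " * 1000
    print(estimate_tokens(prompt), fits_in_context(prompt))
```

Reserving part of the window for the response keeps a prompt that only just fits from crowding out the generated output; the 1024-token reserve is an illustrative default.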

Top use-case fit

Long context

Included by capability and metadata signals in the decision map.

Vision

Included by capability and metadata signals in the decision map.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

Vision, Multimodal

Benchmark peer bars for Long context

No task-mapped benchmark peers are available for this model yet.

Migration checks

No linked migration route is available for this model yet.

Frequently asked questions

What is the context window of Phi-4 Multimodal?

Phi-4 Multimodal has a context window of 128k tokens.

When was Phi-4 Multimodal released?

Phi-4 Multimodal was released on 2025-01-01.
