Phi-4 Reasoning Vision 15B
Open SourceMultimodal
About
15B parameter open-weight multimodal reasoning model from Microsoft Research for vision-language tasks. Supports image captioning, math/science reasoning, and UI understanding. Released March 2026.
Capabilities
VisionMultimodalReasoningFunction CallingTool UseJSON ModeCode Execution