Phi-4 Mini Reasoning

Name: Phi-4 Mini Reasoning
Author: Microsoft Research

Released

2026-05-16

Last refreshed

2026-05-22

Status

Researched 39d ago

Open source
Commercial use: permitted
Long context

Phi-4 Mini Reasoning is a released, open-source reasoning model with a 128k-token context window. No provider token pricing is tracked yet, so evaluate it with that gap in mind while pricing coverage matures.

Use it for

Teams evaluating long context
Workloads that can use a 128k context window
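
Whether a workload can actually use the 128k window depends on the prompt size plus the room left for output. The sketch below is a rough budget check, not a tokenizer: `CONTEXT_WINDOW_TOKENS`, `estimate_tokens`, `fits_in_context`, and the 4-characters-per-token heuristic are all illustrative assumptions, and "128k" is read here as 128,000 tokens, which may differ from the exact limit a given provider enforces.

```python
import math

# Assumption: "128k" is read as 128,000 tokens; a provider's exact limit may differ.
CONTEXT_WINDOW_TOKENS = 128_000


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate from character count.

    The 4-characters-per-token ratio is a heuristic assumption; use the
    model's real tokenizer when an exact count matters.
    """
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(
    prompt_tokens: int,
    reserved_output_tokens: int,
    context_window: int = CONTEXT_WINDOW_TOKENS,
) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window."""
    return prompt_tokens + reserved_output_tokens <= context_window


if __name__ == "__main__":
    prompt = "word " * 50_000  # 250,000 characters of filler text
    needed = estimate_tokens(prompt)
    print(needed, fits_in_context(needed, reserved_output_tokens=20_000))
```

Reserving an explicit output budget is a design choice worth keeping for a reasoning-specialized model, since step-by-step answers can be long; the right reserve depends on the workload.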

Do not use it for

Cost-sensitive launches that need sourced token pricing
Vision or document-understanding workloads
Strict JSON or tool-calling flows

Specifications

Family: Phi-4
Released: 2026-05-16
Context: 128k
Parameters: 3.8B
Knowledge cutoff: 2025-02
Specialization: reasoning
Openness: Open source
License: MIT (OSI-approved)
Commercial use: permitted

Created by

Microsoft Research

Advancing the state-of-the-art in AI and computing.

Redmond, Washington, United States

Founded 1991


Pricing

No tracked provider token pricing is available yet.


About

Microsoft's Phi-4 Mini variant with reasoning capabilities, optimized for step-by-step problem solving. It is listed separately from phi-4-mini-flash-reasoning, which emphasizes speed. Whether the two share a checkpoint has not been verified here; they may be different checkpoints.

Phi-4 Mini Reasoning is an open-source model in the Phi-4 family. The structured metadata tracks a 128k-token context window and a reasoning specialization. Headline tracked benchmarks are AIME 2024 (57.5), MATH-500 (94.6), and Google-Proof Q&A, GPQA Diamond (52.0).

Top use-case fit

Long context

Included by capability and metadata signals in the decision map.

Provider price ladder

No tracked provider token pricing is available for this model yet.

Capabilities

Reasoning

Benchmark peer bars for Long context

No task-mapped benchmark peers are available for this model yet.

Benchmark scores (3)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version / notes	Source
AIME 2024	57.5	Accuracy; official Microsoft technical report, Table 3	https://arxiv.org/html/2504.21233
MATH-500	94.6	Accuracy; official Microsoft technical report	https://arxiv.org/html/2504.21233
Google-Proof Q&A	52.0	GPQA Diamond; accuracy	https://arxiv.org/html/2504.21233
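
The point that the same numeric gap can mean different things across suites can be made concrete with a small calculation. The sketch below is one possible way to look at it: it converts a gap in accuracy points into the share of remaining errors that gap would remove. It assumes the scores are accuracy percentages on a 0 to 100 scale where higher is better; the function name `relative_error_reduction` and the 1-point gap are illustrative and not part of the tracked data.

```python
def relative_error_reduction(score: float, gap: float) -> float:
    """Share of remaining errors removed by a `gap`-point gain in accuracy.

    Assumes `score` is accuracy in percent (0-100) and higher is better.
    """
    remaining_error = 100.0 - score
    if remaining_error <= 0:
        raise ValueError("score must be below 100 to have errors left to remove")
    return gap / remaining_error


# Scores as listed in the table above.
scores = {"AIME 2024": 57.5, "MATH-500": 94.6, "GPQA Diamond": 52.0}

if __name__ == "__main__":
    for name, score in scores.items():
        share = relative_error_reduction(score, gap=1.0)
        print(f"{name}: a 1-point gain removes {share:.1%} of remaining errors")
```

Under these assumptions, a 1-point gain near the MATH-500 score removes about 18.5% of the remaining errors, while the same gain near the AIME 2024 or GPQA Diamond scores removes only a little over 2%. This is one reason to read a margin against the suite it comes from rather than as a raw difference.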