MAI Models by Microsoft AI

Microsoft AI · Proprietary

12 models · 2025–2026 · Up to 256k context · From $0.36/1M input

Details

Researcher: Microsoft AI

License: Proprietary

Commercial use: conditional

Models: 12

Released: 2025–2026

Max context: 256k

Capabilities

Vision: 4 of 12 models

Multimodal: 7 of 12 models

Reasoning: 3 of 12 models

Function Calling: 1 of 12 models

Tool Use: 2 of 12 models

About

Microsoft AI (MAI) is Microsoft's proprietary model family for Copilot and Azure AI Foundry. The lineup now spans reasoning, coding, image generation/editing, speech synthesis, and transcription models, including MAI-Thinking-1, MAI-Code-1, MAI-Code-1-Flash, MAI-Image-2.5, MAI-Image-2.5-Flash, MAI-Voice-2, and MAI-Transcribe-1.5 alongside the earlier MAI image and speech releases.

Current Variants

Use-when guidance is based on each model's tracked capabilities, context window, release date, and replacement status.

Current MAI variants with use-when guidance and lifecycle status
Model	Use when	Released	Signals	Status
MAI-Thinking-1	Use when the workload needs reasoning, 256k context, and tool use.	2026-06	reasoning, 256k context, tool use	Current
MAI-Code-1	Use when the workload needs code.	2026-06	code	Current
MAI-Code-1-Flash	Use when the workload needs code, 256k context, and reasoning.	2026-06	code, 256k context, reasoning	Current
MAI-Image-2.5	Use when the workload needs image generation, 32k context, and multimodal inputs.	2026-06	image generation, 32k context, multimodal inputs	Current
MAI-Image-2.5-Flash	Use when the workload needs image generation, 32k context, and multimodal inputs.	2026-06	image generation, 32k context, multimodal inputs	Current
MAI-Voice-2	Use when the workload needs text to speech and audio.	2026-06	text to speech, audio	Current
MAI-Transcribe-1.5	Use when the workload needs speech recognition, multimodal inputs, and audio.	2026-06	speech recognition, multimodal inputs, audio	Current
MAI-Image-2e	Use when the workload needs image generation, 33k context, and multimodal inputs.	2026-04	image generation, 33k context, multimodal inputs	Current
MAI-Transcribe-1	Use when the workload needs speech recognition, multimodal inputs, and audio.	2026-04	speech recognition, multimodal inputs, audio	Current
MAI-Voice-1	Use when the workload needs text to speech, multimodal inputs, and audio.	2026-04	text to speech, multimodal inputs, audio	Current
MAI-Image-2	Use when the workload needs image generation and multimodal inputs.	2026-03	image generation, multimodal inputs	Current
MAI-DS-R1	Use when the workload needs reasoning and 164k context (671B-parameter model).	2025-04	reasoning, 164k context, 671B parameters	Current

Release Timeline

4 release groups

2026-06 (7 current): MAI-Code-1, MAI-Code-1-Flash, MAI-Image-2.5, MAI-Image-2.5-Flash, MAI-Thinking-1, MAI-Transcribe-1.5, MAI-Voice-2

2026-04 (3 current): MAI-Image-2e, MAI-Transcribe-1, MAI-Voice-1

2026-03 (1 current): MAI-Image-2

2025-04 (1 current): MAI-DS-R1

Specifications (12 models)

MAI model specifications comparison
Model	Released	Context	Parameters	Vision	Multimodal	Reasoning	Function Calling	Tool Use
MAI-Thinking-1	2026-06	256k	1T total / 35B active	No	No	Yes	Yes	Yes
MAI-Code-1	2026-06	—	—	No	No	No	No	No
MAI-Code-1-Flash	2026-06	256k	—	No	No	Yes	No	Yes
MAI-Image-2.5	2026-06	32k	—	Yes	Yes	No	No	No
MAI-Image-2.5-Flash	2026-06	32k	—	Yes	Yes	No	No	No
MAI-Voice-2	2026-06	—	—	No	No	No	No	No
MAI-Transcribe-1.5	2026-06	—	—	No	Yes	No	No	No
MAI-Image-2e	2026-04	33k	—	Yes	Yes	No	No	No
MAI-Transcribe-1	2026-04	—	—	No	Yes	No	No	No
MAI-Voice-1	2026-04	—	—	No	Yes	No	No	No
MAI-Image-2	2026-03	—	—	Yes	Yes	No	No	No
MAI-DS-R1	2025-04	164k	671B	No	No	Yes	No	No
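
The specifications table can be treated as data when shortlisting a model. The following is a minimal sketch, assuming the table rows above are transcribed by hand; `ModelSpec`, `find_models`, and `capability_counts` are hypothetical names introduced for illustration, not part of any Microsoft API. A dash in the Context column is represented as `None`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSpec:
    name: str
    context_k: Optional[int]  # context in thousands of tokens; None where the table shows a dash
    vision: bool
    multimodal: bool
    reasoning: bool
    function_calling: bool
    tool_use: bool


# Rows transcribed from the specifications table (illustrative, hand-copied).
SPECS = [
    ModelSpec("MAI-Thinking-1", 256, False, False, True, True, True),
    ModelSpec("MAI-Code-1", None, False, False, False, False, False),
    ModelSpec("MAI-Code-1-Flash", 256, False, False, True, False, True),
    ModelSpec("MAI-Image-2.5", 32, True, True, False, False, False),
    ModelSpec("MAI-Image-2.5-Flash", 32, True, True, False, False, False),
    ModelSpec("MAI-Voice-2", None, False, False, False, False, False),
    ModelSpec("MAI-Transcribe-1.5", None, False, True, False, False, False),
    ModelSpec("MAI-Image-2e", 33, True, True, False, False, False),
    ModelSpec("MAI-Transcribe-1", None, False, True, False, False, False),
    ModelSpec("MAI-Voice-1", None, False, True, False, False, False),
    ModelSpec("MAI-Image-2", None, True, True, False, False, False),
    ModelSpec("MAI-DS-R1", 164, False, False, True, False, False),
]

CAPABILITIES = ["vision", "multimodal", "reasoning", "function_calling", "tool_use"]


def find_models(min_context_k=0, **required):
    """Return names of models matching every required capability flag.

    When min_context_k is set, models with an unlisted context are skipped.
    """
    matches = []
    for spec in SPECS:
        if min_context_k and (spec.context_k is None or spec.context_k < min_context_k):
            continue
        if all(getattr(spec, cap) == want for cap, want in required.items()):
            matches.append(spec.name)
    return matches


def capability_counts():
    """Count how many models list each capability."""
    return {cap: sum(getattr(s, cap) for s in SPECS) for cap in CAPABILITIES}
```

For example, `find_models(min_context_k=200, reasoning=True, tool_use=True)` returns `["MAI-Thinking-1", "MAI-Code-1-Flash"]`, and `capability_counts()` reproduces the totals in the Capabilities section (4 vision, 7 multimodal, 3 reasoning, 1 function calling, 2 tool use).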

Available From (1 provider)

Microsoft Foundry

Pricing

MAI model pricing by provider
Model	Provider	Input / 1M	Output / 1M	Type
MAI-Transcribe-1	Microsoft Foundry	$0.36	—	Serverless
MAI-Code-1-Flash	Microsoft Foundry	$0.75	$4.5	Serverless
MAI-Image-2	Microsoft Foundry	$5	$33	Serverless
MAI-Voice-1	Microsoft Foundry	$22	—	Serverless
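
To compare the listed prices for a concrete workload, the per-1M rates can be turned into a rough cost estimate. This is a sketch under stated assumptions: the prices are copied by hand from the table above, the billing unit behind "1M" may differ for speech models, and `PRICES`, `estimate_cost`, and `cheapest_by_input` are hypothetical helpers rather than any provider API. A dash in the Output column is stored as `None`.

```python
# (input price per 1M, output price per 1M) in USD, copied from the pricing table.
# None marks a dash in the table, meaning no output price is listed.
PRICES = {
    "MAI-Transcribe-1": (0.36, None),
    "MAI-Code-1-Flash": (0.75, 4.5),
    "MAI-Image-2": (5.0, 33.0),
    "MAI-Voice-1": (22.0, None),
}


def estimate_cost(model, input_units, output_units=0):
    """Estimate the USD cost of a workload from the listed per-1M prices."""
    input_price, output_price = PRICES[model]
    cost = input_units / 1_000_000 * input_price
    if output_units:
        if output_price is None:
            raise ValueError(f"no output price listed for {model}")
        cost += output_units / 1_000_000 * output_price
    return cost


def cheapest_by_input(models=None):
    """Return the model with the lowest listed input price among the candidates."""
    candidates = list(models) if models else list(PRICES)
    return min(candidates, key=lambda m: PRICES[m][0])
```

With these figures, `estimate_cost("MAI-Code-1-Flash", 2_000_000, 500_000)` gives 3.75, and `cheapest_by_input()` returns `"MAI-Transcribe-1"`. Restricting the candidates to models that fit the task matters here, since the cheapest entry overall is a transcription model.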

Frequently Asked Questions

What is MAI used for?: MAI covers reasoning, code, image generation and editing, speech synthesis, and transcription. Each model targets one of these workloads, so match the model to the task using the capability columns in the specifications table.
How does MAI compare to Claude 3?: MAI by Microsoft AI is the family to evaluate first for reasoning workloads (3 of its 12 listed models support reasoning), while Claude 3 by Anthropic is the closest related family to check for vision and multimodal work. MAI lists 12 variants with up to 256k context; Claude 3 reaches up to 200k context. Compare the specifications and pricing tables before choosing a production model.
Which MAI model should I use?: The lowest listed input price is MAI-Transcribe-1 through Microsoft Foundry at $0.36/1M input tokens, but it is a speech recognition model, so it only fits transcription workloads. Among the priced models, MAI-Code-1-Flash ($0.75 input, $4.5 output per 1M) is the lowest-priced option for code and reasoning. For the most capable and most recent choice, evaluate MAI-Thinking-1, which offers 256k context with reasoning, tool use, and function calling.

Models (12)

MAI-Thinking-1: 2026-06, 256k context, 1T total / 35B active parameters, 1 provider, reasoning

MAI-Code-1: 2026-06, 1 provider

MAI-Code-1-Flash: 2026-06, 256k context, 1 provider

MAI-Image-2.5

MAI-Image-2.5-Flash

MAI-Voice-2

MAI-Transcribe-1.5

MAI-Image-2e

MAI-Transcribe-1

MAI-Voice-1

MAI-Image-2

MAI-DS-R1