LLM Reference

MiniMax M2.5 Highspeed

minimax-m2.5-highspeed

Proprietary

About

MiniMax M2.5 Highspeed is MiniMax's inference-optimized variant of M2.5, released alongside the standard model in February 2026. It delivers the same intelligence and outputs as standard M2.5 at lower latency via a specialized inference engine. The model supports a 204,800-token context window, a 131,072-token max output, function calling, structured output, and reasoning. API model ID: MiniMax-M2.5-highspeed. It is designed for latency-sensitive interactive applications and automated agent pipelines.
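As a sketch of how the model ID and capabilities above might be used, here is an illustrative chat-completions request body. Only the model ID (MiniMax-M2.5-highspeed) comes from this page; the field names, tool schema, and `lookup_ticket` tool are assumptions following the common OpenAI-style convention and may differ from MiniMax's actual API.

```python
import json

# Illustrative request body for an OpenAI-compatible chat-completions API.
# Field names and the tool definition are conventions, not MiniMax's
# documented schema; the model ID is taken from this page.
payload = {
    "model": "MiniMax-M2.5-highspeed",
    "messages": [
        {"role": "user", "content": "Summarize this ticket in one sentence."}
    ],
    # Function calling: declare a tool the model may choose to invoke.
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "lookup_ticket",  # hypothetical tool name
                "description": "Fetch a support ticket by ID.",
                "parameters": {
                    "type": "object",
                    "properties": {"ticket_id": {"type": "string"}},
                    "required": ["ticket_id"],
                },
            },
        }
    ],
    "max_tokens": 4096,  # must not exceed the 131,072-token max output
}

# The serialized body would be sent as the POST payload.
body = json.dumps(payload)
```

The structured-output and reasoning options would be set through additional request fields, which vary by provider.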

MiniMax M2.5 Highspeed has a 200K-token (204,800) context window.

MiniMax M2.5 Highspeed is priced at $0.60 per 1M input tokens and $2.40 per 1M output tokens.
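The listed rates translate into per-request costs as in this back-of-envelope sketch; the token counts in the example are arbitrary illustrations.

```python
# Cost estimate using the listed rates:
# $0.60 per 1M input tokens, $2.40 per 1M output tokens.
INPUT_RATE = 0.60 / 1_000_000   # USD per input token
OUTPUT_RATE = 2.40 / 1_000_000  # USD per output token

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request at the listed rates."""
    return input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE

# e.g. a 10,000-token prompt with a 2,000-token completion:
cost = request_cost(10_000, 2_000)  # 0.006 + 0.0048 = 0.0108 USD
```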

Capabilities

Vision · Multimodal · Reasoning · Function Calling · Tool Use · Structured Outputs · Code Execution · Prompt Caching · Batch API · Audio · Fine-tuning

Providers (2)

Provider     Input (per 1M)   Output (per 1M)   Type
MiniMax                                         Serverless
Novita AI    $0.60            $2.40             Serverless


Specifications

Released: 2026-02-12
Context: 204,800 tokens (200K)
Max output: 131,072 tokens
Architecture: Decoder-only
Specialization: General
License: Proprietary
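One practical consequence of the specifications above: a request's prompt budget depends on how much output it reserves. The helper below assumes input and output tokens share the 204,800-token window, which is a common convention but not confirmed by this page.

```python
# Prompt-budget helper for this model's limits, assuming (not confirmed
# here) that prompt and completion tokens share the context window.
CONTEXT_WINDOW = 204_800   # total context window, per the spec table
MAX_OUTPUT = 131_072       # maximum completion tokens

def max_input_tokens(desired_output: int) -> int:
    """Largest prompt that still leaves room for `desired_output`
    completion tokens within the shared context window."""
    if not 0 < desired_output <= MAX_OUTPUT:
        raise ValueError("desired_output must be in 1..131072")
    return CONTEXT_WINDOW - desired_output

# Requesting the full 131,072-token output leaves 73,728 tokens for input.
budget = max_input_tokens(131_072)
```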

Created by

MiniMax
Developing AI for gaming and entertainment.

Minhang, Shanghai, China
Founded 2021
