Kimi K2.5

Name: Kimi K2.5
Author: Moonshot AI

Released

2026-03-15

Last refreshed

2026-06-29

Status

Researched 46d ago

ProprietaryCommercial use: conditionalMultimodalCodingRAGAgentsLong contextVisionClassificationJSON / Tool use

Kimi K2.5 is worth evaluating for coding, rag, and agents when its provider route and context window match the workload.

Use it for

Teams evaluating coding, rag, and agents
Workloads that can use a 256k context window
Buyers comparing 4 tracked provider routes

Do not use it for

Workloads where another current model has stronger sourced task evidence

Specifications

Family: Kimi
Released: 2026-03-15
Context: 256k
Parameters: 1T (MoE, 384 experts)
Architecture: Mixture of Experts
Specialization: code
Openness: Proprietary
License: ProprietaryCommercial use: conditional
Weights: Not released
Code: Unknown
Training: Fine-tuned

Created by

Moonshot AI

Lossless long-context AI innovation

Beijing, China

Founded 2023

Website

Pricing

Output / 1M

$2.00

Input / 1M

$0.440

Cheapest of 10 routes · OpenRouter

Providers(10)

Cloudflare Workers AI Fireworks AI OpenRouter Together AI NVIDIA NIM AWS Bedrock Replicate API Microsoft Foundry Vercel AI Gateway Novita AI

View 10 provider routes

About

Kimi K2.5 is Moonshot AI's Kimi model focused on code generation and software engineering. It offers a 256K-token context window and scores 87.9 on GPQA.

Kimi K2.5 is a proprietary model in the Kimi family. The structured metadata tracks a 256k-token context window, multimodal input, function calling, and structured outputs. This page tracks provider routes through Cloudflare Workers AI, Fireworks AI, OpenRouter, and 7 more, with the cheapest tracked route listed at $0.44 input and $2 output per 1M tokens. Headline tracked benchmarks include Google-Proof Q&A 87.9, MMLU PRO 87.1, and BFCL 47.1.

Top use-case fit: coding, agents, and build tasks

Coding

Q/$ C

2 relevant benchmarks in the decision map.

RAG

Included by capability and metadata signals in the decision map.

Agents

Q/$ C

4 relevant benchmarks in the decision map.

Provider price ladder

Compare all 10

Compare API pricing across 4 providers for input and output tokens, batch, and cached reads when available.

Provider	Input / 1M	Output / 1M	Route
OpenRouter	$0.440	$2.00	Serverless
Together AI	$0.500	$2.80	Serverless
AWS Bedrock	$0.600	$3.00	Serverless
Fireworks AI	$0.600	$3.00	Serverless

Available via routers & gateways(8)

LiteLLM

Gateway

Open-source Python SDK and proxy server that unifies 100+ LLM APIs behind a single OpenAI-compatible interface, with load balancing, cost tracking, and configurable failover.

Free OSSMicrosoft Foundry

OpenRouter

Hybrid

Unified hybrid gateway to 400+ models from 60+ providers via a single OpenAI-compatible API, with optional auto-routing that selects the best model per prompt.

PassthroughTogether AIFireworks AI

Portkey

Gateway

Production AI gateway routing to 1,600+ LLMs with failover, load balancing, semantic caching, and guardrails; Apache 2.0 core is fully self-hostable with the complete feature set.

SubscriptionMicrosoft Foundry

Amazon Bedrock Intelligent Prompt Routing

Router

AWS Bedrock's native intelligent prompt router that routes prompts between Anthropic Claude model tiers (Haiku/Sonnet) based on predicted task complexity, with no extra per-routing charge.

PassthroughAWS Bedrock

Azure AI Foundry Model Router

Router

Microsoft Azure AI Foundry's native model router that uses a trained ML model to route each prompt in real time to the optimal Azure-hosted model, with Balanced/Cost/Quality mode selection and automatic failover.

PassthroughMicrosoft Foundry

Helicone

Gateway

Observability-first AI gateway with routing, caching, rate limiting, and request tracing; Apache 2.0 open-source core with a managed hosted tier for logging and analytics.

SubscriptionMicrosoft Foundry

Capabilities

VisionMultimodalFunction CallingStructured Outputs

Benchmark peer barsfor Coding

SWE-bench VerifiedRank 31 of 80

Claude Fable 5

96.0

Claude Mythos Preview

93.9

Claude Opus 4.8

88.6

Claude Opus 4.7

87.6

Kimi K2.5current

76.8

LiveCodeBenchRank 17 of 55

DeepSeek V4 Pro

93.5

Gemini 3.1 Pro Preview

91.7

DeepSeek V4 Flash

91.6

Qwen3.7-Max

91.6

Kimi K2.5current

85.0

Benchmark scores(17)

Scores are benchmark-specific and are direction-aware: the same numeric gap can mean very different outcomes across suites. Use the leaderboard context and this model's provider route to decide whether the winning margin is meaningful for your workload.

Benchmark	Score	Version	Evaluation	Source
Google-Proof Q&A	87.9	diamondObserved 2026-04-18	—	Source
MMLU PRO	87.1	Thinking mode (accuracy)Observed 2026-06-07	—	Source
BFCL	47.1	v4Observed 2026-04-19	—	Source
τ-bench	74.2	τ-benchObserved 2026-04-24	—	Source
MultiChallenge	61.4	MultiChallengeObserved 2026-04-26	—	Source
MMMU Pro	78.5	LLM-Stats aggregatorObserved 2026-06-07	—	Source
SWE-rebench	58.5	pass@1 (best of 5 runs)Observed 2026-05-28	—	Source
AIME 2025	96.1	Thinking mode (accuracy)Observed 2026-06-07	—	Source
Berkeley Function Calling Leaderboard v3	64.5	BFCL v3 (accuracy%)Observed 2026-06-07	—	Source
BrowseComp	60.6	BrowseComp (accuracy%)Observed 2026-06-07	—	Source
Humanity's Last Exam	50.2	HLE-Full with tools (agentic) (accuracy)Observed 2026-06-07	—	Source
LiveCodeBench	85.0	LiveCodeBench v6 (pass@1)Observed 2026-06-07	—	Source
MATH-500	98.0	Thinking mode (accuracy)Observed 2026-06-07	—	Source
MCP-Atlas	29.5	MCP-Atlas (accuracy%)Observed 2026-06-07	—	Source
SWE-bench Verified	76.8	From official GitHub model card (resolved)Observed 2026-06-07	—	Source
Terminal-Bench 2.0	50.8	Terminal-Bench 2.0 (accuracy%)Observed 2026-06-07	—	Source
CursorBench	31.9	CursorBench 3.1Observed 2026-06-30	Configuration: Kimi 2.5 (single reported configuration) Harness: CursorBench 3.1 Evaluator: Cursor Confidence: confirmed Notes: Cursor published one CursorBench 3.1 configuration for this model; no cross-effort selection was needed.	Source

Migration checks

No linked migration route is available for this model yet.

Show all 65 popular comparisonssorted by 7-day search impressions

Kimi K2.5 vs GPT-5.5162 Kimi K2.5 vs DeepSeek V3.1132 Kimi K2.5 vs DeepSeek R1130 Kimi K2.5 vs Xiaomi MiMo-V2.5128 Kimi K2.5 vs Qwen3.5-397B-A17B123 Kimi K2.5 vs Claude Opus 4.5105 Kimi K2.5 vs Qwen3-235B-A22B105 Kimi K2.5 vs GLM-5V-Turbo103 Kimi K2.5 vs Gemini 3 Pro90 Kimi K2.5 vs Xiaomi MiMo-V2.5-TTS-Series85 Kimi K2.5 vs Qwen3.6-35B-A3B79 Kimi K2.5 vs Qwen2.5-72B-Instruct73 Kimi K2.5 vs Together AI Qwen2-72B-Instruct73 Kimi K2.5 vs Claude Opus 4.672 Kimi K2.5 vs Ling-2.6-Flash66 Kimi K2.5 vs Qwen3.6-27B62 Kimi K2.5 vs Grok-358 Kimi K2.5 vs GLM-5 9B53 Kimi K2.5 vs DeepSeek R1 Distill Llama 70B49 Kimi K2.5 vs Mistral Large 3 675B Instruct47 Kimi K2.5 vs GLM-5 Turbo46 Kimi K2.5 vs DeepSeek R1 052846 Kimi K2.5 vs Llama 3.1 70B Instruct45 Kimi K2.5 vs Gemini 2.5 Flash Live API43 Kimi K2.5 vs o338 Kimi K2.5 vs Trinity-Large-Thinking38 Kimi K2.5 vs Claude 3.7 Sonnet36 Kimi K2.5 vs Together AI Qwen2-7B-Instruct33 Kimi K2.5 vs Qwen3.5-35B-A3B32 Kimi K2.5 vs GPT-5.227 Kimi K2.5 vs Llama 3.1 405B Instruct27 Kimi K2.5 vs Qwen2.5-7B-Instruct26 Kimi K2.5 vs Llama 3 70B Instruct26 Kimi K2.5 vs Gemini 2.5 Pro Computer Use Preview26 Kimi K2.5 vs Llama 2 13B Chat24 Kimi K2.5 vs Phi-3 Mini 4k23 Kimi K2.5 vs Mistral Large 222 Kimi K2.5 vs Tencent Hunyuan Turbo S21 Kimi K2.5 vs Gemini 2.5 Flash20 Kimi K2.5 vs Qwen3-9B20 Kimi K2.5 vs Qwen3.5-27B19 Kimi K2.5 vs Mistral Nemotron19 Kimi K2.5 vs Gemma 7B Instruct17 Kimi K2.5 vs o3 Mini17 Kimi K2.5 vs DeepSeek V3.215 Kimi K2.5 vs GPT-5.4 Pro14 Kimi K2.5 vs Llama 3 8B Instruct11 Kimi K2.5 vs Llama 3.2 1B11 Kimi K2.5 vs Llama 2 70B Chat10 Kimi K2.5 vs GPT-5.4-Cyber9 Kimi K2.5 vs o3 Deep Research9 Kimi K2.5 vs Qwen2.5-72B8 Kimi K2.5 vs Together AI - Llama 3 8B Lite7 Kimi K2.5 vs GPT-5.46 Kimi K2.5 vs Mixtral 8x7B5 Kimi K2.5 vs Qwen2-7B-Instruct4 Kimi K2.5 vs Qwen3.5-122B-A10B4 Kimi K2.5 vs Mixtral 8x22B Instruct v0.34 Kimi K2.5 vs StepFun Step-24 Kimi K2.5 vs Llama 3.2 1B Instruct3 Kimi K2.5 vs Qwen3.5-9B3 Kimi K2.5 vs Gemini 2.5 Pro Preview 05-062 Kimi K2.5 vs Phi-4 Mini Flash Reasoning2 Kimi K2.5 vs Qwen2.5-Max1 Kimi K2.5 vs Qwen3-Max0

Frequently asked questions

What is the context window of Kimi K2.5?

Kimi K2.5 has a context window of 256k tokens.

How much does Kimi K2.5 cost?

Kimi K2.5 pricing ranges from $0.44/1M to $0.6/1M input tokens depending on the provider.

When was Kimi K2.5 released?

Kimi K2.5 was released on 2026-03-15.

Which providers offer Kimi K2.5?

Kimi K2.5 is available from 10 providers: Cloudflare Workers AI, Fireworks AI, OpenRouter, Together AI, NVIDIA NIM, AWS Bedrock, Replicate API, Microsoft Foundry, Vercel AI Gateway, Novita AI.

What benchmarks has Kimi K2.5 been tested on?

Kimi K2.5 has been evaluated on 17 benchmarks, including Google-Proof Q&A, MMLU PRO, BFCL, τ-bench, MultiChallenge.

Created by

Moonshot AI

Lossless long-context AI innovation

Beijing, China

Founded 2023

Website

Pricing

Output / 1M

$2.00

Input / 1M

$0.440

Cheapest of 10 routes · OpenRouter

Providers(10)

Cloudflare Workers AI Fireworks AI OpenRouter Together AI NVIDIA NIM AWS Bedrock Replicate API Microsoft Foundry Vercel AI Gateway Novita AI

View 10 provider routes

Kimi K2.5

Use it for

Do not use it for

About

Top use-case fit: coding, agents, and build tasks

Coding

RAG

Agents

Provider price ladder

Available via routers & gateways(8)

LiteLLM

OpenRouter

Portkey

Amazon Bedrock Intelligent Prompt Routing

Azure AI Foundry Model Router

Helicone

Capabilities

Benchmark peer barsfor Coding

Benchmark scores(17)

Migration checks

Rankings & picks(2)

Compare Kimi K2.5 with other models

Comparison and alternatives

Frequently asked questions