MiniMax M2.5 vs Kimi K2.5

MiniMax M2.5 and Kimi K2.5 split the 14 shared benchmarks evenly: MiniMax M2.5 leads on 7 and Kimi K2.5 leads on 7, with no benchmark tied. The average score difference (MiniMax M2.5 minus Kimi K2.5) is -1.56.

MiniMax M2.5

MiniMaxAI · 2026-02-12 · Reasoning model

Kimi K2.5

Moonshot AI · 2026-01-27 · Multimodal model

MiniMax M2.5: 7 wins (50%) · Kimi K2.5: 7 wins (50%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 14 shared benchmarks.

General Knowledge

Kimi K2.5 5/5

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
HLE	19.40 · rank 121 / 172 · Thinking (No Tools)	50.20 · rank 27 / 172 · Thinking (With Tools)	-30.80
LiveBench	60.14 · rank 68 / 115 · Deep Thinking (No Tools)	69.07 · rank 42 / 115 · Thinking (No Tools)	-8.93
ARC-AGI-2	4.90 · rank 47 / 62 · Thinking (No Tools)	11.80 · rank 39 / 62 · Thinking (No Tools)	-6.90
GPQA Diamond	85.20 · rank 53 / 187 · Thinking (No Tools)	87.60 · rank 37 / 187 · Thinking (No Tools)	-2.40
ARC-AGI	63.70 · rank 35 / 68 · Thinking (No Tools)	65.30 · rank 34 / 68 · Thinking (No Tools)	-1.60

Claw-style Agent Evaluation

MiniMax M2.5 2/2

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
Claw Bench	92.10 · rank 4 / 29 · Thinking (With Tools)	81.70 · rank 18 / 29 · Thinking (With Tools)	+10.40
Pinch Bench	87.80 · rank 6 / 37 · Thinking (With Tools)	84.80 · rank 17 / 37 · Thinking (With Tools)	+3

Coding and Software Engineer

MiniMax M2.5 2/2

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
SWE-Bench Pro - Public	55.40 · rank 26 / 54	50.70 · rank 41 / 54 · Thinking (With Tools)	+4.70
SWE-bench Verified	80.20 · rank 14 / 112	76.80 · rank 30 / 112 · Thinking (With Tools)	+3.40

AI Agent - Information Search

MiniMax M2.5 1/1

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
BrowseComp	76.30 · rank 23 / 53	60.60 · rank 36 / 53 · Thinking (With Tools + Internet)	+15.70

AI Agent - Tool Usage

MiniMax M2.5 1/1

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
Terminal Bench 2.0	51.70 · rank 31 / 47	50.80 · rank 34 / 47 · Thinking (With Tools)	+0.90

Long Context

MiniMax M2.5 1/1

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
AA-LCR	69.50 · rank 5 / 15 · Thinking (No Tools)	65 · rank 12 / 15 · Thinking (No Tools)	+4.50

Math and Reasoning

Kimi K2.5 1/1

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
AIME2025	86.30 · rank 49 / 107 · Thinking (No Tools)	96.10 · rank 21 / 107 · Thinking (No Tools)	-9.80

Productivity Knowledge

Kimi K2.5 1/1

Benchmark	MiniMax M2.5	Kimi K2.5	Diff
GDPval-AA	36 · rank 17 / 21 · Thinking (No Tools)	40 · rank 15 / 21 · Thinking (No Tools)	-4

Specs

Field	MiniMax M2.5	Kimi K2.5
Publisher	MiniMaxAI	Moonshot AI
Release date	2026-02-12	2026-01-27
Model type	Reasoning model	Multimodal model
Architecture	MoE	MoE
Parameters	229B	1T
Context length	128K	256K
Max output	Not available	16K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	MiniMax M2.5	Kimi K2.5
Text input	$0.3 / 1M tokens	$0.6 / 1M tokens
Text output	$2.4 / 1M tokens	$3 / 1M tokens
Cache read	Not public	$0.1 / 1M tokens
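
The listed prices can be turned into a rough cost comparison for a given workload. The following is a minimal sketch, not an official calculator: the function name `estimate_cost` and the workload of 2M input tokens and 500K output tokens are hypothetical, and cache-read pricing is ignored because it is not public for MiniMax M2.5.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "MiniMax M2.5": {"input": 0.3, "output": 2.4},
    "Kimi K2.5": {"input": 0.6, "output": 3.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one workload, ignoring any cache discount."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
# MiniMax M2.5: $1.80
# Kimi K2.5: $2.70
```

The gap widens or narrows with the input/output mix, since the two models differ by 2x on input price but only 1.25x on output price.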

Summary

MiniMax M2.5 leads in: Claw-style Agent Evaluation (2/2), Coding and Software Engineer (2/2), AI Agent - Information Search (1/1), AI Agent - Tool Usage (1/1), Long Context (1/1)
Kimi K2.5 leads in: General Knowledge (5/5), Math and Reasoning (1/1), Productivity Knowledge (1/1)

On average across the 14 shared benchmarks, Kimi K2.5 scores 1.56 higher.

Largest single-benchmark gap: HLE, where MiniMax M2.5 scores 19.40 and Kimi K2.5 scores 50.20 (-30.80).
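
The summary figures can be re-derived from the Diff column of the benchmark tables. The sketch below assumes the page's average is a plain unweighted mean of the 14 differences (MiniMax M2.5 minus Kimi K2.5); that assumption is not stated on the page, but it reproduces the published -1.56. The variable names are illustrative only.

```python
# Score differences (MiniMax M2.5 minus Kimi K2.5), copied from the Diff column.
DIFFS = {
    "HLE": -30.80,
    "LiveBench": -8.93,
    "ARC-AGI-2": -6.90,
    "GPQA Diamond": -2.40,
    "ARC-AGI": -1.60,
    "Claw Bench": 10.40,
    "Pinch Bench": 3.00,
    "SWE-Bench Pro - Public": 4.70,
    "SWE-bench Verified": 3.40,
    "BrowseComp": 15.70,
    "Terminal Bench 2.0": 0.90,
    "AA-LCR": 4.50,
    "AIME2025": -9.80,
    "GDPval-AA": -4.00,
}

# A positive difference is a MiniMax M2.5 win, a negative one a Kimi K2.5 win.
minimax_wins = sum(d > 0 for d in DIFFS.values())
kimi_wins = sum(d < 0 for d in DIFFS.values())
ties = sum(d == 0 for d in DIFFS.values())

# Assumed: unweighted arithmetic mean over all shared benchmarks.
mean_diff = sum(DIFFS.values()) / len(DIFFS)

# Benchmark with the largest absolute difference.
largest_gap = max(DIFFS, key=lambda name: abs(DIFFS[name]))

print(minimax_wins, kimi_wins, ties)    # 7 7 0
print(round(mean_diff, 2))              # -1.56
print(largest_gap, DIFFS[largest_gap])  # HLE -30.8
```

Because the mean is unweighted, the single HLE result (-30.80) accounts for most of Kimi K2.5's average lead even though the win count is even.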

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
