MiniMax-M2.7 vs Kimi K2.5

Across 8 shared benchmarks, MiniMax-M2.7 leads overall with 5 wins to Kimi K2.5's 3 and no ties. The average score difference is +0.43 points in MiniMax-M2.7's favor.

MiniMax-M2.7: MiniMaxAI · 2026-03-18 · Reasoning model

Kimi K2.5: Moonshot AI · 2026-01-27 · Multimodal model

MiniMax-M2.7: 5 wins (63%) · Kimi K2.5: 3 wins (38%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 8 shared benchmarks.

General Knowledge: Kimi K2.5 leads 3/3

Benchmark	MiniMax-M2.7	Kimi K2.5	Diff
HLE	28 · rank 96 / 172 · Thinking (No Tools)	50.20 · rank 27 / 172 · Thinking (With Tools)	-22.20
LiveBench	63.49 · rank 56 / 115 · Deep Thinking (No Tools)	69.07 · rank 42 / 115 · Thinking (No Tools)	-5.58
GPQA Diamond	87 · rank 42 / 187 · Thinking (No Tools)	87.60 · rank 37 / 187 · Thinking (No Tools)	-0.60

Claw-style Agent Evaluation: MiniMax-M2.7 leads 2/2

Benchmark	MiniMax-M2.7	Kimi K2.5	Diff
Claw Bench	91.70 · rank 5 / 29 · Thinking (With Tools)	81.70 · rank 18 / 29 · Thinking (With Tools)	+10
Pinch Bench	87.10 · rank 9 / 37 · Thinking (With Tools)	84.80 · rank 17 / 37 · Thinking (With Tools)	+2.30

Coding and Software Engineer: MiniMax-M2.7 leads 1/1

Benchmark	MiniMax-M2.7	Kimi K2.5	Diff
SWE-Bench Pro - Public	56.20 · rank 24 / 54 · Thinking (With Tools)	50.70 · rank 41 / 54 · Thinking (With Tools)	+5.50

Long Context: MiniMax-M2.7 leads 1/1

Benchmark	MiniMax-M2.7	Kimi K2.5	Diff
AA-LCR	69 · rank 6 / 15 · Thinking (With Tools)	65 · rank 12 / 15 · Thinking (No Tools)	+4

Productivity Knowledge: MiniMax-M2.7 leads 1/1

Benchmark	MiniMax-M2.7	Kimi K2.5	Diff
GDPval-AA	50 · rank 13 / 21 · Thinking (No Tools)	40 · rank 15 / 21 · Thinking (No Tools)	+10

Prices use DataLearner records when available; missing fields are not inferred.

MiniMax-M2.7 leads in: Claw-style Agent Evaluation (2/2), Coding and Software Engineer (1/1), Long Context (1/1), Productivity Knowledge (1/1)
Kimi K2.5 leads in: General Knowledge (3/3)

On average across the 8 shared benchmarks, MiniMax-M2.7 scores 0.43 points higher than Kimi K2.5.

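Largest single-benchmark gap: HLE, where MiniMax-M2.7 scores 28 and Kimi K2.5 scores 50.20 (-22.20). The two HLE results were recorded under different settings: Thinking (No Tools) for MiniMax-M2.7 and Thinking (With Tools) for Kimi K2.5.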
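
The summary figures above (win counts, average difference, largest gap) can be reproduced from the per-benchmark scores. The following is a minimal sketch, not the page's actual generator: the `SCORES` dictionary and the `summarize` function are hypothetical names introduced here, and the values are the scores as read from the tables on this page.

```python
# Scores as read from the benchmark tables: (MiniMax-M2.7, Kimi K2.5).
# SCORES and summarize are illustrative names, not part of any real API.
SCORES = {
    "HLE": (28.0, 50.20),
    "LiveBench": (63.49, 69.07),
    "GPQA Diamond": (87.0, 87.60),
    "Claw Bench": (91.70, 81.70),
    "Pinch Bench": (87.10, 84.80),
    "SWE-Bench Pro - Public": (56.20, 50.70),
    "AA-LCR": (69.0, 65.0),
    "GDPval-AA": (50.0, 40.0),
}


def summarize(scores):
    """Return win counts, mean difference, and the largest-gap benchmark."""
    # Difference is first model minus second model, rounded to 2 decimals
    # to avoid floating-point noise such as -22.200000000000003.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}
    return {
        "wins_a": sum(d > 0 for d in diffs.values()),
        "wins_b": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        # Benchmark with the largest absolute difference.
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
        "diffs": diffs,
    }


if __name__ == "__main__":
    result = summarize(SCORES)
    print(result["wins_a"], result["wins_b"], result["ties"])
    print(result["mean_diff"], result["largest_gap"])
```

Note that the mean is taken over raw score differences, so benchmarks with wide score ranges (such as HLE here) weigh more heavily than ones where the two models are close.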

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.