MiniMax-M2.7 vs Kimi K2.5

Across 7 shared benchmarks, MiniMax-M2.7 leads overall: it wins 5 and Kimi K2.5 wins 2, with 0 ties and an average score difference of +1.29 points (MiniMax-M2.7 minus Kimi K2.5).

MiniMax-M2.7
MiniMaxAI · 2026-03-18 · Reasoning model

Kimi K2.5
Moonshot AI · 2026-01-27 · Multimodal model

MiniMax-M2.7: 5 wins (71%) · Kimi K2.5: 2 wins (29%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 7 shared benchmarks.

Claw-style Agent Evaluation

MiniMax-M2.7 2/2
| Benchmark | MiniMax-M2.7 | Kimi K2.5 | Diff |
| --- | --- | --- | --- |
| Claw Bench | 91.70 · 5 / 29 · Thinking (With Tools) | 81.70 · 18 / 29 · Thinking (With Tools) | +10 |
| Pinch Bench | 87.10 · 9 / 37 · Thinking (With Tools) | 84.80 · 17 / 37 · Thinking (With Tools) | +2.30 |

General Knowledge

Kimi K2.5 2/2
| Benchmark | MiniMax-M2.7 | Kimi K2.5 | Diff |
| --- | --- | --- | --- |
| HLE | 28 · 82 / 157 · Thinking (No Tools) | 50.20 · 20 / 157 · Thinking (With Tools) | -22.20 |
| GPQA Diamond | 87 · 38 / 178 · Thinking (No Tools) | 87.60 · 34 / 178 · Thinking (No Tools) | -0.60 |

Coding and Software Engineer

MiniMax-M2.7 1/1
| Benchmark | MiniMax-M2.7 | Kimi K2.5 | Diff |
| --- | --- | --- | --- |
| SWE-Bench Pro - Public | 56.20 · 16 / 43 · Thinking (With Tools) | 50.70 · 31 / 43 · Thinking (With Tools) | +5.50 |

Long Context

MiniMax-M2.7 1/1
| Benchmark | MiniMax-M2.7 | Kimi K2.5 | Diff |
| --- | --- | --- | --- |
| AA-LCR | 69 · 4 / 13 · Thinking (With Tools) | 65 · 10 / 13 · Thinking (No Tools) | +4 |

Productivity Knowledge

MiniMax-M2.7 1/1
| Benchmark | MiniMax-M2.7 | Kimi K2.5 | Diff |
| --- | --- | --- | --- |
| GDPval-AA | 50 · 13 / 21 · Thinking (No Tools) | 40 · 15 / 21 · Thinking (No Tools) | +10 |

Specs

| Field | MiniMax-M2.7 | Kimi K2.5 |
| --- | --- | --- |
| Publisher | MiniMaxAI | Moonshot AI |
| Release date | 2026-03-18 | 2026-01-27 |
| Model type | Reasoning model | Multimodal model |
| Architecture | MoE | MoE |
| Parameters | 229B | 1T |
| Context length | 200K | 256K |
| Max output | 200K | 16K |

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

| Item | MiniMax-M2.7 | Kimi K2.5 |
| --- | --- | --- |
| Text input | $0.3 / 1M tokens | Not public |
| Text output | $1.2 / 1M tokens | Not public |
| Cache read | $0.06 / 1M tokens | Not public |
| Cache write | $0.375 / 1M tokens | Not public |

Kimi K2.5 has no public pricing in these records, so the two models cannot be compared on cost.
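As a rough illustration of how the listed MiniMax-M2.7 prices translate into a bill, the following Python sketch multiplies token counts by the per-1M-token rates from the pricing table. The names `PRICE_PER_MILLION` and `estimate_cost` and the example workload are hypothetical, introduced only for this sketch. It also assumes the four token counts are disjoint, meaning cached tokens are not billed again as regular input; the page does not state how the provider actually meters them.

```python
# Listed MiniMax-M2.7 prices in USD per 1M tokens, copied from the pricing table.
PRICE_PER_MILLION = {
    "input": 0.3,
    "output": 1.2,
    "cache_read": 0.06,
    "cache_write": 0.375,
}


def estimate_cost(tokens):
    """Return an estimated USD cost for a dict mapping item name to token count.

    Assumption: the counts are disjoint, i.e. cached tokens are not also
    counted as regular input tokens.
    """
    unknown = set(tokens) - set(PRICE_PER_MILLION)
    if unknown:
        raise KeyError(f"no listed price for: {sorted(unknown)}")
    return sum(
        PRICE_PER_MILLION[item] * count / 1_000_000
        for item, count in tokens.items()
    )


# Hypothetical workload: 2M input tokens and 500K output tokens.
cost = estimate_cost({"input": 2_000_000, "output": 500_000})
print(f"${cost:.2f}")  # prints $1.20
```

No equivalent estimate is possible for Kimi K2.5 from this page, since all of its pricing fields are listed as "Not public".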

Summary

  • MiniMax-M2.7 leads in: Claw-style Agent Evaluation (2/2), Coding and Software Engineer (1/1), Long Context (1/1), Productivity Knowledge (1/1)
  • Kimi K2.5 leads in: General Knowledge (2/2)

On average across the 7 shared benchmarks, MiniMax-M2.7 scores 1.29 points higher.
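The headline figures can be reproduced from the seven score pairs listed in the benchmark tables on this page. The following is a minimal Python sketch, not the site's actual generation code. `SCORES` and `summarize` are hypothetical names introduced for illustration, and each diff is assumed to be the MiniMax-M2.7 score minus the Kimi K2.5 score, which matches the signs in the Diff column.

```python
# (benchmark, MiniMax-M2.7 score, Kimi K2.5 score), transcribed from this page.
SCORES = [
    ("Claw Bench", 91.70, 81.70),
    ("Pinch Bench", 87.10, 84.80),
    ("HLE", 28.0, 50.20),
    ("GPQA Diamond", 87.0, 87.60),
    ("SWE-Bench Pro - Public", 56.20, 50.70),
    ("AA-LCR", 69.0, 65.0),
    ("GDPval-AA", 50.0, 40.0),
]


def summarize(scores):
    """Compute win counts, win shares, mean diff, and the largest gap."""
    # Round each diff to 2 decimals to avoid floating-point noise.
    diffs = {name: round(a - b, 2) for name, a, b in scores}
    total = len(diffs)
    wins_a = sum(d > 0 for d in diffs.values())
    wins_b = sum(d < 0 for d in diffs.values())
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_a": wins_a,
        "wins_b": wins_b,
        "ties": total - wins_a - wins_b,
        "share_a_pct": round(100 * wins_a / total),
        "share_b_pct": round(100 * wins_b / total),
        "mean_diff": round(sum(diffs.values()) / total, 2),
        "largest_gap": (largest, diffs[largest]),
    }


summary = summarize(SCORES)
print(summary)
```

Because the average is a plain mean of raw score differences, the single large HLE deficit offsets most of the margin from the five wins, which is why a 5-to-2 record yields an average gap of only 1.29 points.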

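Largest single-benchmark gap: HLE, where MiniMax-M2.7 scores 28 and Kimi K2.5 scores 50.20 (-22.20). The two HLE scores were recorded in different modes, Thinking (No Tools) for MiniMax-M2.7 and Thinking (With Tools) for Kimi K2.5, so this gap is not a like-for-like comparison.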

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.