Kimi K2.6 vs GLM 5.1

Across 10 shared benchmarks, Kimi K2.6 leads overall with 9 wins to GLM 5.1's 1, no ties, and an average score difference of +2.27 in Kimi K2.6's favor.

Kimi K2.6: Moonshot AI · 2026-04-20 · Reasoning model

GLM 5.1: Zhipu AI · 2026-03-27 · Reasoning model

Kimi K2.6: 9 wins (90%) · GLM 5.1: 1 win (10%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 10 shared benchmarks.

AI Agent - Tool Usage · Kimi K2.6 leads 2/3

Benchmark	Kimi K2.6	GLM 5.1	Diff
Tool Decathlon	50 · rank 2 / 9 · Thinking (With Tools)	40.70 · rank 5 / 9 · Thinking (With Tools)	+9.30
TerminalBench 2.1	53.56 · rank 27 / 27 · Thinking (No Tools)	58.70 · rank 24 / 27 · Thinking High (With Tools)	-5.14
Terminal Bench 2.0	66.70 · rank 10 / 47 · Thinking (With Tools)	63.50 · rank 13 / 47 · Thinking (With Tools)	+3.20

General Knowledge · Kimi K2.6 leads 3/3

Benchmark	Kimi K2.6	GLM 5.1	Diff
GPQA Diamond	90.50 · rank 18 / 187 · Thinking (No Tools)	86.20 · rank 47 / 187 · Thinking (No Tools)	+4.30
LiveBench	72.17 · rank 28 / 115 · Thinking (No Tools)	70.18 · rank 37 / 115 · Normal (No Tools)	+1.99
HLE	54 · rank 15 / 172 · Thinking (With Tools + Internet)	52.30 · rank 19 / 172 · Thinking (With Tools)	+1.70

Math and Reasoning · Kimi K2.6 leads 2/2

Benchmark	Kimi K2.6	GLM 5.1	Diff
IMO-AnswerBench	86 · rank 8 / 21 · Thinking (No Tools)	83.80 · rank 12 / 21 · Thinking (No Tools)	+2.20
AIME 2026	96.40 · rank 3 / 18 · Thinking (No Tools)	95.30 · rank 4 / 18 · Thinking (No Tools)	+1.10

AI Agent - Information Search · Kimi K2.6 leads 1/1

Benchmark	Kimi K2.6	GLM 5.1	Diff
BrowseComp	83.20 · rank 14 / 53 · Thinking (With Tools + Internet)	79.30 · rank 17 / 53 · Thinking (With Tools + Internet)	+3.90

Coding and Software Engineer · Kimi K2.6 leads 1/1

Benchmark	Kimi K2.6	GLM 5.1	Diff
SWE-Bench Pro - Public	58.60 · rank 13 / 54 · Thinking (With Tools)	58.40 · rank 15 / 54 · Thinking (With Tools)	+0.20

Prices use DataLearner records when available; missing fields are not inferred.

Kimi K2.6 leads in: AI Agent - Tool Usage (2/3), General Knowledge (3/3), Math and Reasoning (2/2), AI Agent - Information Search (1/1), Coding and Software Engineer (1/1)
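The per-category tallies can be reproduced from the Diff column. The following is a minimal sketch, not the site's actual pipeline: the function names are hypothetical, and the assignment of each benchmark to a category is an assumption inferred from the order of the groups on this page.

```python
from collections import defaultdict

# (category, benchmark, diff = Kimi K2.6 score minus GLM 5.1 score).
# Diffs come from the Diff column; the category labels are an assumption
# inferred from the order of the groups on the page.
DIFFS = [
    ("AI Agent - Tool Usage", "Terminal Bench 2.0", 3.20),
    ("AI Agent - Tool Usage", "Tool Decathlon", 9.30),
    ("AI Agent - Tool Usage", "TerminalBench 2.1", -5.14),
    ("General Knowledge", "GPQA Diamond", 4.30),
    ("General Knowledge", "LiveBench", 1.99),
    ("General Knowledge", "HLE", 1.70),
    ("Math and Reasoning", "IMO-AnswerBench", 2.20),
    ("Math and Reasoning", "AIME 2026", 1.10),
    ("AI Agent - Information Search", "BrowseComp", 3.90),
    ("Coding and Software Engineer", "SWE-Bench Pro - Public", 0.20),
]


def leads_by_category(rows):
    """Return {category: "wins/total"} counting positive diffs as wins."""
    tally = defaultdict(lambda: [0, 0])  # category -> [wins, total]
    for category, _benchmark, diff in rows:
        tally[category][1] += 1
        if diff > 0:
            tally[category][0] += 1
    return {cat: f"{wins}/{total}" for cat, (wins, total) in tally.items()}


def sort_by_gap(rows):
    """Group rows by category, largest absolute gap first within each."""
    grouped = defaultdict(list)
    for category, benchmark, diff in rows:
        grouped[category].append((benchmark, diff))
    return {
        cat: sorted(items, key=lambda item: abs(item[1]), reverse=True)
        for cat, items in grouped.items()
    }


if __name__ == "__main__":
    for category, lead in leads_by_category(DIFFS).items():
        print(f"{category}: Kimi K2.6 leads {lead}")
```

Sorting by absolute gap, rather than signed gap, is what places TerminalBench 2.1 (-5.14) ahead of Terminal Bench 2.0 (+3.20) in the tool-usage group.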

On average across the 10 shared benchmarks, Kimi K2.6 scores 2.27 higher.

Largest single-benchmark gap: Tool Decathlon — Kimi K2.6 50 vs GLM 5.1 40.70 (+9.30).
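The headline figures (win counts, average difference, largest gap) follow directly from the ten score pairs. A minimal sketch of that arithmetic is below; the `summarize` helper and its field names are hypothetical, and the scores are transcribed from the benchmark tables on this page.

```python
from statistics import mean

# (benchmark, Kimi K2.6 score, GLM 5.1 score), transcribed from the tables.
SCORES = [
    ("Tool Decathlon", 50.00, 40.70),
    ("TerminalBench 2.1", 53.56, 58.70),
    ("Terminal Bench 2.0", 66.70, 63.50),
    ("GPQA Diamond", 90.50, 86.20),
    ("LiveBench", 72.17, 70.18),
    ("HLE", 54.00, 52.30),
    ("IMO-AnswerBench", 86.00, 83.80),
    ("AIME 2026", 96.40, 95.30),
    ("BrowseComp", 83.20, 79.30),
    ("SWE-Bench Pro - Public", 58.60, 58.40),
]


def summarize(scores):
    """Compute per-benchmark diffs, win/tie counts, mean diff, largest gap."""
    # Round each diff to two decimals to match the precision of the tables.
    diffs = {name: round(kimi - glm, 2) for name, kimi, glm in scores}
    return {
        "diffs": diffs,
        "kimi_wins": sum(d > 0 for d in diffs.values()),
        "glm_wins": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": mean(diffs.values()),
        # Largest gap is taken by absolute value, regardless of who leads.
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


if __name__ == "__main__":
    summary = summarize(SCORES)
    print(summary["kimi_wins"], summary["glm_wins"], summary["ties"])
    print(f"mean diff: {summary['mean_diff']:.3f}")
    print("largest gap:", summary["largest_gap"])
```

The unrounded mean of the listed differences is 2.275, which sits exactly on the rounding boundary of the reported +2.27; how the page rounds that value is not stated here.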

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.