Kimi K2.6 vs GLM 5.1

Across 9 shared benchmarks, Kimi K2.6 leads overall: Kimi K2.6 wins 8, GLM 5.1 wins 1, with 0 ties and an average score difference of +2.31.

Kimi K2.6
Moonshot AI · 2026-04-20 · Reasoning model

GLM 5.1
Zhipu AI (智谱AI) · 2026-03-27 · Reasoning model

Kimi K2.6: 8 wins (89%) · GLM 5.1: 1 win (11%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 9 shared benchmarks.

AI Agent - Tool Usage

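Kimi K2.6 leads 2/3

| Benchmark | Kimi K2.6 score | Kimi K2.6 rank | Kimi K2.6 mode | GLM 5.1 score | GLM 5.1 rank | GLM 5.1 mode | Diff |
|---|---|---|---|---|---|---|---|
| Tool Decathlon | 50 | 1 / 7 | Thinking (With Tools) | 40.70 | 3 / 7 | Thinking (With Tools) | +9.30 |
| TerminalBench 2.1 | 53.56 | 13 / 13 | Thinking (No Tools) | 58.70 | 11 / 13 | Thinking High (With Tools) | -5.14 |
| Terminal Bench 2.0 | 66.70 | 10 / 46 | Thinking (With Tools) | 63.50 | 13 / 46 | Thinking (With Tools) | +3.20 |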

General Knowledge

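Kimi K2.6 leads 2/2

| Benchmark | Kimi K2.6 score | Kimi K2.6 rank | Kimi K2.6 mode | GLM 5.1 score | GLM 5.1 rank | GLM 5.1 mode | Diff |
|---|---|---|---|---|---|---|---|
| GPQA Diamond | 90.50 | 16 / 178 | Thinking (No Tools) | 86.20 | 42 / 178 | Thinking (No Tools) | +4.30 |
| HLE | 54 | 9 / 157 | Thinking (With Tools + Internet) | 52.30 | 12 / 157 | Thinking (With Tools) | +1.70 |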

Math and Reasoning

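Kimi K2.6 leads 2/2

| Benchmark | Kimi K2.6 score | Kimi K2.6 rank | Kimi K2.6 mode | GLM 5.1 score | GLM 5.1 rank | GLM 5.1 mode | Diff |
|---|---|---|---|---|---|---|---|
| IMO-AnswerBench | 86 | 6 / 19 | Thinking (No Tools) | 83.80 | 10 / 19 | Thinking (No Tools) | +2.20 |
| AIME 2026 | 96.40 | 1 / 14 | Thinking (No Tools) | 95.30 | 2 / 14 | Thinking (No Tools) | +1.10 |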

AI Agent - Information Search

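Kimi K2.6 leads 1/1

| Benchmark | Kimi K2.6 score | Kimi K2.6 rank | Kimi K2.6 mode | GLM 5.1 score | GLM 5.1 rank | GLM 5.1 mode | Diff |
|---|---|---|---|---|---|---|---|
| BrowseComp | 83.20 | 10 / 45 | Thinking (With Tools + Internet) | 79.30 | 13 / 45 | Thinking (With Tools + Internet) | +3.90 |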

Coding and Software Engineer

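Kimi K2.6 leads 1/1

| Benchmark | Kimi K2.6 score | Kimi K2.6 rank | Kimi K2.6 mode | GLM 5.1 score | GLM 5.1 rank | GLM 5.1 mode | Diff |
|---|---|---|---|---|---|---|---|
| SWE-Bench Pro - Public | 58.60 | 7 / 43 | Thinking (With Tools) | 58.40 | 9 / 43 | Thinking (With Tools) | +0.20 |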

Specs

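| Field | Kimi K2.6 | GLM 5.1 |
|---|---|---|
| Publisher | Moonshot AI | Zhipu AI (智谱AI) |
| Release date | 2026-04-20 | 2026-03-27 |
| Model type | Reasoning model | Reasoning model |
| Architecture | MoE | MoE |
| Parameters | 1T | 75.4B |
| Context length | 256K | 200K |
| Max output | Not available | 125K |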

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

| Item | Kimi K2.6 | GLM 5.1 |
|---|---|---|
| Text input | $0.95 / 1M tokens | $1.4 / 1M tokens |
| Text output | $4 / 1M tokens | $4.4 / 1M tokens |
| Cache read | $0.16 / 1M tokens | $4.4 / 1M tokens |
| Cache write | $0.95 / 1M tokens | $0.26 / 1M tokens |
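
As a rough illustration of how the per-million-token prices above translate into a bill, here is a minimal sketch. The function name `estimate_cost` and the workload of 2M input tokens and 500K output tokens are hypothetical and chosen only for the example. Cache read and write pricing is left out, so this is a simplification rather than a full billing model.

```python
# Text input/output prices in USD per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "Kimi K2.6": {"input": 0.95, "output": 4.0},
    "GLM 5.1": {"input": 1.4, "output": 4.4},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of a workload, ignoring cache pricing."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    # Prices are quoted per 1M tokens; round to suppress float noise.
    return round(total / 1_000_000, 4)


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES_PER_MILLION:
        print(model, estimate_cost(model, 2_000_000, 500_000))
```

Actual costs depend on how much of the input is served from cache, which this sketch does not model.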

Summary

  • Kimi K2.6 leads in: AI Agent - Tool Usage (2/3), General Knowledge (2/2), Math and Reasoning (2/2), AI Agent - Information Search (1/1), Coding and Software Engineer (1/1)

On average across the 9 shared benchmarks, Kimi K2.6 scores 2.31 higher.

Largest single-benchmark gap: Tool Decathlon — Kimi K2.6 50 vs GLM 5.1 40.70 (+9.30).
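
The summary figures can be recomputed from the per-benchmark scores. The sketch below is illustrative and is not the page's own generation code: the function name `summarize` and the dictionary keys are hypothetical, and each difference is assumed to be the Kimi K2.6 score minus the GLM 5.1 score, rounded to two decimals.

```python
# Scores copied from the benchmark tables: (benchmark, Kimi K2.6, GLM 5.1).
SCORES = [
    ("Tool Decathlon", 50.0, 40.70),
    ("TerminalBench 2.1", 53.56, 58.70),
    ("Terminal Bench 2.0", 66.70, 63.50),
    ("GPQA Diamond", 90.50, 86.20),
    ("HLE", 54.0, 52.30),
    ("IMO-AnswerBench", 86.0, 83.80),
    ("AIME 2026", 96.40, 95.30),
    ("BrowseComp", 83.20, 79.30),
    ("SWE-Bench Pro - Public", 58.60, 58.40),
]


def summarize(scores):
    """Return win counts, mean difference, and the largest absolute gap."""
    # Assumed convention: diff = Kimi K2.6 score minus GLM 5.1 score.
    diffs = [(name, round(kimi - glm, 2)) for name, kimi, glm in scores]
    return {
        "kimi_wins": sum(1 for _, d in diffs if d > 0),
        "glm_wins": sum(1 for _, d in diffs if d < 0),
        "ties": sum(1 for _, d in diffs if d == 0),
        "mean_diff": round(sum(d for _, d in diffs) / len(diffs), 2),
        "largest_gap": max(diffs, key=lambda item: abs(item[1])),
    }


if __name__ == "__main__":
    print(summarize(SCORES))
```

Run against the nine rows above, this gives 8 wins for Kimi K2.6, 1 for GLM 5.1, no ties, a mean difference of 2.31, and Tool Decathlon as the largest gap, matching the figures stated on the page.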

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.