© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.

Privacy policyTerms of service
Page navigation
目录
Model catalogKimi K2.6Benchmark analysis

Kimi K2.6 Benchmark Details

Kimi K2.6's leading benchmark results are LiveCodeBench (score 89.60, ranked 3 of 110), HLE with tools and internet access (score 54, ranked 6 of 133), and AIME 2026 (score 96.40, ranked 1 of 13). This page also compares it with three competitor models and three earlier models in the same series, with performance and pricing views where available. One source link is listed for reference.

Benchmark Results


Comprehensive Evaluation

3 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| GPQA Diamond | Thinking Enabled | 90.50 | 13 / 167 |
| HLE | Thinking Enabled | 34.70 | 45 / 133 |
| HLE | Thinking Enabled + Tools + Internet | 54 | 6 / 133 |

Coding and Software Engineering

4 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| LiveCodeBench | Thinking Enabled | 89.60 | 3 / 110 |
| SWE-bench Verified | Thinking Enabled + Tools | 80.20 | 8 / 96 |
| SWE-bench Multilingual | Thinking Enabled + Tools | 76.70 | 2 / 10 |
| SWE-Bench Pro - Public | Thinking Enabled + Tools | 58.60 | 3 / 28 |

AI Agent - Information Gathering

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| BrowseComp | Thinking Enabled + Tools + Internet | 83.20 | 5 / 37 |

AI Agent - Tool Use

3 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| OSWorld-Verified | Thinking Enabled + Tools | 73.10 | 4 / 13 |
| Terminal Bench 2.0 | Thinking Enabled + Tools | 66.70 | 6 / 35 |
| Tool Decathlon | Thinking Enabled + Tools | 50 | 1 / 7 |

Mathematical Reasoning

2 evaluations

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| AIME 2026 | Thinking Enabled | 96.40 | 1 / 13 |
| IMO-AnswerBench | Thinking Enabled | 86 | 2 / 10 |

OpenClaw Agent Capability Evaluation

1 evaluation

| Benchmark | Mode | Score | Rank / total |
| --- | --- | --- | --- |
| Claw Bench | Thinking Enabled + Tools | 80.90 | 19 / 28 |

Competitor Comparison

Benchmark scores for Kimi K2.6 compared against top models in its class

Models compared: Kimi K2.6, Qwen3.6-Max-Preview, MiniMax-M2.7, GLM 5.1
The chart shows each model’s highest score per benchmark within the current filter. See the table below for per-mode details.
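
The "highest score per benchmark within the current filter" rule can be sketched as follows. This is a minimal illustration, not the site's implementation: the function name `best_per_benchmark` and the `allowed_modes` filter are hypothetical, and the rows are the Kimi K2.6 GPQA Diamond and HLE results listed on this page.

```python
# (benchmark, mode, score) rows taken from the Kimi K2.6 results on this page.
ROWS = [
    ("GPQA Diamond", "Thinking Enabled", 90.50),
    ("HLE", "Thinking Enabled", 34.70),
    ("HLE", "Thinking Enabled + Tools + Internet", 54.00),
]


def best_per_benchmark(rows, allowed_modes=None):
    """Keep the highest-scoring (mode, score) pair for each benchmark.

    allowed_modes is a hypothetical stand-in for the page's mode filter:
    when given, rows whose mode is not in the set are ignored.
    """
    best = {}
    for benchmark, mode, score in rows:
        if allowed_modes is not None and mode not in allowed_modes:
            continue
        if benchmark not in best or score > best[benchmark][1]:
            best[benchmark] = (mode, score)
    return best


if __name__ == "__main__":
    # No filter: HLE resolves to the tools + internet run (54.00).
    print(best_per_benchmark(ROWS))
    # Filtered to plain thinking mode: HLE resolves to 34.70 instead.
    print(best_per_benchmark(ROWS, allowed_modes={"Thinking Enabled"}))
```

Because the selected mode can differ from model to model, two best scores for the same benchmark are not always like-for-like; the mode label next to each score shows which run was picked.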

Benchmark Score Comparison

9 benchmarks have comparable scores. Each model's best score is shown, with its mode label next to it.

| Benchmark | Category | Kimi K2.6 (current) | Qwen3.6-Max-Preview | MiniMax-M2.7 | GLM 5.1 |
| --- | --- | --- | --- | --- | --- |
| GPQA Diamond | Comprehensive Evaluation | 90.50 (Thinking Enabled) | -- | 87.00 (Thinking Enabled) | 86.20 (Thinking Enabled) |
| HLE | Comprehensive Evaluation | 54.00 (Thinking Enabled + Tools) | -- | 28.00 (Thinking Enabled) | 52.30 (Thinking Enabled + Tools) |
| SWE-Bench Pro - Public | Coding and Software Engineering | 58.60 (Thinking Enabled + Tools) | -- | 56.20 (Thinking Enabled + Tools) | 58.40 (Thinking Enabled + Tools) |
| BrowseComp | AI Agent - Information Gathering | 83.20 (Thinking Enabled + Tools) | -- | -- | 79.30 (Thinking Enabled + Tools) |
| Terminal Bench 2.0 | AI Agent - Tool Use | 66.70 (Thinking Enabled + Tools) | 65.40 (Deep Thinking Mode + Tools) | -- | 63.50 (Thinking Enabled + Tools) |
| Tool Decathlon | AI Agent - Tool Use | 50.00 (Thinking Enabled + Tools) | -- | -- | 40.70 (Thinking Enabled + Tools) |
| AIME 2026 | Mathematical Reasoning | 96.40 (Thinking Enabled) | -- | -- | 95.30 (Thinking Enabled) |
| IMO-AnswerBench | Mathematical Reasoning | 86.00 (Thinking Enabled) | -- | -- | 83.80 (Thinking Enabled) |
| Claw Bench | OpenClaw Agent Capability Evaluation | 80.90 (Thinking Enabled + Tools) | -- | 91.70 (Thinking Enabled + Tools) | -- |

Standard API Pricing: Kimi K2.6 vs. Peer Models

Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.

Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens

When a context threshold exists, the charted base price only applies within these limits:

Qwen3.6-Max-Preview: Input <= 32K
| Model | Supplier | Standard input | Standard output | Base price applies to |
| --- | --- | --- | --- | --- |
| Kimi K2.6 | Moonshot AI | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| Qwen3.6-Max-Preview | -- | 6 | 24 | Input <= 32K |
| MiniMax-M2.7 | MiniMaxAI | $0.3 / 1M tokens | $1.2 / 1M tokens | -- |
| GLM 5.1 | Zhipu AI | $1.4 / 1M tokens | $4.4 / 1M tokens | -- |
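
As a rough illustration of how per-1M-token prices translate into a bill, the sketch below applies the standard input and output rates from the pricing table on this page to a made-up workload. The function name `request_cost` and the token counts are assumptions for the example only. Qwen3.6-Max-Preview is left out because its row does not state a currency.

```python
# Standard (input, output) prices in USD per 1M tokens, from the pricing table on this page.
PRICES = {
    "Kimi K2.6": (0.95, 4.00),
    "MiniMax-M2.7": (0.30, 1.20),
    "GLM 5.1": (1.40, 4.40),
}


def request_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the model's standard base rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
    for model in PRICES:
        cost = request_cost(model, 200_000, 50_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the totals come to about $0.39 for Kimi K2.6, $0.12 for MiniMax-M2.7, and $0.50 for GLM 5.1. Output tokens cost roughly three to four times as much as input tokens across these models, so output-heavy workloads shift the comparison more than input-heavy ones.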

Version History

How each version of the Kimi K2.6 series stacks up on benchmark tests

Models compared: Kimi K2.6, Kimi K2.5, Kimi K2 Thinking, Kimi K2
The chart shows each model’s highest score per benchmark within the current filter. See the table below for per-mode details.

Benchmark Score Comparison

11 benchmarks have comparable scores. Each model's best score is shown, with its mode label next to it.

| Benchmark | Category | Kimi K2.6 (current) | Kimi K2.5 | Kimi K2 Thinking | Kimi K2 |
| --- | --- | --- | --- | --- | --- |
| GPQA Diamond | Comprehensive Evaluation | 90.50 (Thinking Enabled) | 87.60 (Thinking Enabled) | 84.50 (Thinking Enabled) | 75.10 (Standard Mode) |
| HLE | Comprehensive Evaluation | 54.00 (Thinking Enabled + Tools) | 30.10 (Thinking Enabled) | 51.00 (Thinking Enabled + Tools) | 4.70 (Standard Mode) |
| LiveCodeBench | Coding and Software Engineering | 89.60 (Thinking Enabled) | 85.00 (Thinking Enabled) | 83.10 (Thinking Enabled) | 53.70 (Standard Mode) |
| SWE-bench Multilingual | Coding and Software Engineering | 76.70 (Thinking Enabled + Tools) | 73.00 (Thinking Enabled) | -- | -- |
| SWE-Bench Pro - Public | Coding and Software Engineering | 58.60 (Thinking Enabled + Tools) | 50.70 (Thinking Enabled + Tools) | -- | -- |
| SWE-bench Verified | Coding and Software Engineering | 80.20 (Thinking Enabled + Tools) | 76.80 (Thinking Enabled + Tools) | 71.30 (Thinking Enabled + Tools) | 51.80 (Standard Mode) |
| BrowseComp | AI Agent - Information Gathering | 83.20 (Thinking Enabled + Tools) | 60.60 (Thinking Enabled + Tools) | 60.20 (Thinking Enabled + Tools) | -- |
| Terminal Bench 2.0 | AI Agent - Tool Use | 66.70 (Thinking Enabled + Tools) | 50.80 (Thinking Enabled + Tools) | -- | -- |
| AIME 2026 | Mathematical Reasoning | 96.40 (Thinking Enabled) | 92.50 (Thinking Enabled) | -- | -- |
| IMO-AnswerBench | Mathematical Reasoning | 86.00 (Thinking Enabled) | 81.80 (Thinking Enabled) | -- | -- |
| Claw Bench | OpenClaw Agent Capability Evaluation | 80.90 (Thinking Enabled + Tools) | 81.70 (Thinking Enabled + Tools) | 82.50 (Thinking Enabled + Tools) | -- |

Single-Benchmark Version Trend

[Chart: GPQA Diamond (Comprehensive Evaluation) score by model version. Modes in the legend: Normal, Normal + Tools, Thinking, Thinking + Tools, Deep, Deep + Tools.]

The x-axis shows model and release date and the y-axis shows score. Solid lines connect the same mode across versions, while dotted guides align modes within the same generation.

Standard API Pricing Across the Kimi K2.6 Series

Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.

Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens

| Model | Supplier | Standard input | Standard output | Base price applies to |
| --- | --- | --- | --- | --- |
| Kimi K2.6 | Moonshot AI | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| Kimi K2.5 | -- | $0.6 / 1M tokens | $3 / 1M tokens | -- |
| Kimi K2 Thinking | -- | $0.6 / 1M tokens | $2.5 / 1M tokens | -- |
| Kimi K2 | -- | $0.6 / 1M tokens | $2.5 / 1M tokens | -- |

Sources

kimi.com