Benchmark Results
General Knowledge: 3 evaluations
Coding and Software Engineering: 4 evaluations
AI Agent - Tool Usage: 3 evaluations
Math and Reasoning: 2 evaluations
Competitor Comparison
Benchmark scores for Kimi K2.6 compared against top models in its class
12 benchmarks with comparable scores. Each model shows its best score; the evaluation mode is listed with each score.
| Benchmark | Category | Kimi K2.6 (current) | Qwen3.6-Max-Preview | MiniMax-M2.7 | GLM 5.1 |
|---|---|---|---|---|---|
| GPQA Diamond | Comprehensive Evaluation | 90.50 (Thinking Enabled) | 90.40 (Thinking Level: High) | 87.00 (Thinking Enabled) | 86.20 (Thinking Enabled) |
| HLE | Comprehensive Evaluation | 54.00 (Thinking Enabled, Tools) | 50.20 (Thinking Enabled, Tools) | 28.00 (Thinking Enabled) | 52.30 (Thinking Enabled, Tools) |
| LiveCodeBench | Coding and Software Engineering | 89.60 (Thinking Enabled) | 87.10 (Thinking Level: High) | -- | -- |
| SWE-bench Multilingual | Coding and Software Engineering | 76.70 (Thinking Enabled, Tools) | 73.80 (Thinking Enabled, Tools) | -- | -- |
| SWE-Bench Pro - Public | Coding and Software Engineering | 58.60 (Thinking Enabled, Tools) | 56.60 (Thinking Enabled, Tools) | 56.20 (Thinking Enabled, Tools) | 58.40 (Thinking Enabled, Tools) |
| SWE-bench Verified | Coding and Software Engineering | 80.20 (Thinking Enabled, Tools) | 78.80 (Thinking Enabled, Tools) | -- | -- |
| BrowseComp | AI Agent - Information Gathering | 83.20 (Thinking Enabled, Tools) | -- | -- | 79.30 (Thinking Enabled, Tools) |
| Terminal Bench 2.0 | AI Agent - Tool Usage | 66.70 (Thinking Enabled, Tools) | 65.40 (Deep Thinking Mode, Tools) | -- | 63.50 (Thinking Enabled, Tools) |
| Tool Decathlon | AI Agent - Tool Usage | 50.00 (Thinking Enabled, Tools) | -- | -- | 40.70 (Thinking Enabled, Tools) |
| AIME 2026 | Math Reasoning | 96.40 (Thinking Enabled) | -- | -- | 95.30 (Thinking Enabled) |
| IMO-AnswerBench | Math Reasoning | 86.00 (Thinking Enabled) | 83.80 (Thinking Level: High) | -- | 83.80 (Thinking Enabled) |
| Claw Bench | OpenClaw Agent Capability Comprehensive Evaluation | 80.90 (Thinking Enabled, Tools) | -- | 91.70 (Thinking Enabled, Tools) | -- |
Standard API Pricing: Kimi K2.6 vs. Peer Models
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
When a context threshold exists, the charted base price only applies within these limits:
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Kimi K2.6 | Facebook AI Research Lab | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| Qwen3.6-Max-Preview | Alibaba | $1.3 / 1M tokens | $7.8 / 1M tokens | <= 128 |
| MiniMax-M2.7 | MiniMaxAI | $0.3 / 1M tokens | $1.2 / 1M tokens | -- |
| GLM 5.1 | Zhipu AI | $1.4 / 1M tokens | $4.4 / 1M tokens | -- |
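To see what these per-token rates mean for a concrete workload, here is a minimal sketch that turns the standard prices listed above into an estimated bill. The names `PRICES` and `estimate_cost` and the workload size are hypothetical, chosen only for illustration. The sketch assumes every request stays within the base-price limit, so it ignores any extended-context pricing (relevant for Qwen3.6-Max-Preview, which lists a threshold).

```python
# Standard prices in USD per 1M tokens (input, output), copied from the pricing table.
PRICES = {
    "Kimi K2.6": (0.95, 4.0),
    "Qwen3.6-Max-Preview": (1.3, 7.8),
    "MiniMax-M2.7": (0.3, 1.2),
    "GLM 5.1": (1.4, 4.4),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of a workload at the standard (base) rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for model in PRICES:
    cost = estimate_cost(model, 2_000_000, 500_000)
    print(f"{model}: ${cost:.2f}")
```

Because output tokens cost several times more than input tokens for every model listed, the input/output mix of a workload can matter as much as the headline input price.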
Version History
How each version of the Kimi K2.6 series stacks up on benchmark tests
11 benchmarks with comparable scores. Each model shows its best score; the evaluation mode is listed with each score.
| Benchmark | Category | Kimi K2.6 (current) | Kimi K2.5 | Kimi K2 Thinking | Kimi K2 |
|---|---|---|---|---|---|
| GPQA Diamond | Comprehensive Evaluation | 90.50 (Thinking Enabled) | 87.60 (Thinking Enabled) | 84.50 (Thinking Enabled) | 75.10 (Standard Mode) |
| HLE | Comprehensive Evaluation | 54.00 (Thinking Enabled, Tools) | 50.20 (Thinking Enabled, Tools) | 51.00 (Thinking Enabled, Tools) | 4.70 (Standard Mode) |
| LiveCodeBench | Coding and Software Engineering | 89.60 (Thinking Enabled) | 85.00 (Thinking Enabled) | 83.10 (Thinking Enabled) | 53.70 (Standard Mode) |
| SWE-bench Multilingual | Coding and Software Engineering | 76.70 (Thinking Enabled, Tools) | 73.00 (Thinking Enabled) | -- | -- |
| SWE-Bench Pro - Public | Coding and Software Engineering | 58.60 (Thinking Enabled, Tools) | 50.70 (Thinking Enabled, Tools) | -- | -- |
| SWE-bench Verified | Coding and Software Engineering | 80.20 (Thinking Enabled, Tools) | 76.80 (Thinking Enabled, Tools) | 71.30 (Thinking Enabled, Tools) | 51.80 (Standard Mode) |
| BrowseComp | AI Agent - Information Gathering | 83.20 (Thinking Enabled, Tools) | 60.60 (Thinking Enabled, Tools) | 60.20 (Thinking Enabled, Tools) | -- |
| Terminal Bench 2.0 | AI Agent - Tool Usage | 66.70 (Thinking Enabled, Tools) | 50.80 (Thinking Enabled, Tools) | -- | -- |
| AIME 2026 | Math Reasoning | 96.40 (Thinking Enabled) | 92.50 (Thinking Enabled) | -- | -- |
| IMO-AnswerBench | Math Reasoning | 86.00 (Thinking Enabled) | 81.80 (Thinking Enabled) | -- | -- |
| Claw Bench | OpenClaw Agent Capability Comprehensive Evaluation | 80.90 (Thinking Enabled, Tools) | 81.70 (Thinking Enabled, Tools) | 82.50 (Thinking Enabled, Tools) | -- |
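The version-over-version change is easier to read as a per-benchmark difference. The following is a small sketch, with the hypothetical names `K26_VS_K25` and `score_deltas`, that subtracts the Kimi K2.5 score from the Kimi K2.6 score for each benchmark listed above. It treats the scores as directly comparable, which is an assumption: for SWE-bench Multilingual the two versions are listed under different modes.

```python
# (Kimi K2.6 score, Kimi K2.5 score) pairs copied from the version-history table.
K26_VS_K25 = {
    "GPQA Diamond": (90.50, 87.60),
    "HLE": (54.00, 50.20),
    "LiveCodeBench": (89.60, 85.00),
    "SWE-bench Multilingual": (76.70, 73.00),
    "SWE-Bench Pro - Public": (58.60, 50.70),
    "SWE-bench Verified": (80.20, 76.80),
    "BrowseComp": (83.20, 60.60),
    "Terminal Bench 2.0": (66.70, 50.80),
    "AIME 2026": (96.40, 92.50),
    "IMO-AnswerBench": (86.00, 81.80),
    "Claw Bench": (80.90, 81.70),
}


def score_deltas(pairs):
    """Return new-minus-old score differences, rounded to two decimals."""
    return {name: round(new - old, 2) for name, (new, old) in pairs.items()}


deltas = score_deltas(K26_VS_K25)

# Print benchmarks from largest gain to largest regression.
for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {delta:+.2f}")
```

On these figures the largest gains are on the agent benchmarks (BrowseComp and Terminal Bench 2.0), while Claw Bench is the only benchmark where Kimi K2.6 scores below Kimi K2.5.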
Single-Benchmark Version Trend
Viewing: GPQA Diamond (Comprehensive Evaluation)
Standard API Pricing Across the Kimi K2.6 Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Kimi K2.6 | Facebook AI Research Lab | $0.95 / 1M tokens | $4 / 1M tokens | -- |