Kimi K2.6 Benchmark Details
Kimi K2.6's leading benchmark results are LiveCodeBench (rank 3 of 110, score 89.60), HLE (rank 6 of 133, score 54.00), and AIME 2026 (rank 1 of 13, score 96.40). This page also compares it with 3 competitor models and 3 predecessor or same-series models, with performance and pricing views where available. One source link is attached for reference.
Benchmark Results
General evaluation: 3 evaluations
Coding and software engineering: 4 evaluations
AI Agent - Tool use: 3 evaluations
Competitor Comparison
Benchmark scores for Kimi K2.6 compared against top models in its class
Benchmark Score Comparison
9 benchmarks with comparable scores. Each model shows its best score, with the evaluation mode in parentheses.
| Benchmark | Category | Kimi K2.6 (current) | Qwen3.6-Max-Preview | MiniMax-M2.7 | GLM 5.1 |
|---|---|---|---|---|---|
| GPQA Diamond | General evaluation | 90.50 (Thinking Enabled) | -- | 87.00 (Thinking Enabled) | 86.20 (Thinking Enabled) |
| HLE | General evaluation | 54.00 (Thinking Enabled, Tools) | -- | 28.00 (Thinking Enabled) | 52.30 (Thinking Enabled, Tools) |
| SWE-Bench Pro - Public | Coding and software engineering | 58.60 (Thinking Enabled, Tools) | -- | 56.20 (Thinking Enabled, Tools) | 58.40 (Thinking Enabled, Tools) |
| BrowseComp | AI Agent - Information gathering | 83.20 (Thinking Enabled, Tools) | -- | -- | 79.30 (Thinking Enabled, Tools) |
| Terminal Bench 2.0 | AI Agent - Tool use | 66.70 (Thinking Enabled, Tools) | 65.40 (Deep Thinking Mode, Tools) | -- | 63.50 (Thinking Enabled, Tools) |
| Tool Decathlon | AI Agent - Tool use | 50.00 (Thinking Enabled, Tools) | -- | -- | 40.70 (Thinking Enabled, Tools) |
| AIME 2026 | Mathematical reasoning | 96.40 (Thinking Enabled) | -- | -- | 95.30 (Thinking Enabled) |
| IMO-AnswerBench | Mathematical reasoning | 86.00 (Thinking Enabled) | -- | -- | 83.80 (Thinking Enabled) |
| Claw Bench | OpenClaw comprehensive agent capability evaluation | 80.90 (Thinking Enabled, Tools) | -- | 91.70 (Thinking Enabled, Tools) | -- |
Standard API Pricing: Kimi K2.6 vs. Peer Models
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
When a context threshold exists, the charted base price only applies within these limits:
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Kimi K2.6 | Facebook AI Research Lab | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| Qwen3.6-Max-Preview | -- | 6 | 24 | Input <= 32K |
| MiniMax-M2.7 | MiniMaxAI | $0.3 / 1M tokens | $1.2 / 1M tokens | -- |
| GLM 5.1 | Zhipu AI | $1.4 / 1M tokens | $4.4 / 1M tokens | -- |
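As a rough illustration of how the per-1M-token rates listed above translate into a per-request cost, here is a minimal Python sketch. The function name `request_cost_usd` and the workload of 200K input tokens and 50K output tokens are hypothetical. Qwen3.6-Max-Preview is left out because its row carries no currency unit and its base price only applies to inputs up to 32K.

```python
# Standard prices in USD per 1M tokens as (input, output), copied from the
# pricing table. Qwen3.6-Max-Preview is omitted: its row has no currency unit.
PRICES_USD_PER_1M = {
    "Kimi K2.6": (0.95, 4.00),
    "MiniMax-M2.7": (0.30, 1.20),
    "GLM 5.1": (1.40, 4.40),
}


def request_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost in USD of one request at the standard rates."""
    input_price, output_price = PRICES_USD_PER_1M[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200K input tokens and 50K output tokens per request.
for model in PRICES_USD_PER_1M:
    cost = request_cost_usd(model, 200_000, 50_000)
    print(f"{model}: ${cost:.4f}")
```

Because output tokens are priced several times higher than input tokens for all three models, the output share of a workload tends to dominate the total even when it is the smaller token count.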
Version History
How each version of the Kimi K2.6 series stacks up on benchmark tests
Benchmark Score Comparison
11 benchmarks with comparable scores. Each model shows its best score, with the evaluation mode in parentheses.
| Benchmark | Category | Kimi K2.6 (current) | Kimi K2.5 | Kimi K2 Thinking | Kimi K2 |
|---|---|---|---|---|---|
| GPQA Diamond | General evaluation | 90.50 (Thinking Enabled) | 87.60 (Thinking Enabled) | 84.50 (Thinking Enabled) | 75.10 (Standard Mode) |
| HLE | General evaluation | 54.00 (Thinking Enabled, Tools) | 30.10 (Thinking Enabled) | 51.00 (Thinking Enabled, Tools) | 4.70 (Standard Mode) |
| LiveCodeBench | Coding and software engineering | 89.60 (Thinking Enabled) | 85.00 (Thinking Enabled) | 83.10 (Thinking Enabled) | 53.70 (Standard Mode) |
| SWE-bench Multilingual | Coding and software engineering | 76.70 (Thinking Enabled, Tools) | 73.00 (Thinking Enabled) | -- | -- |
| SWE-Bench Pro - Public | Coding and software engineering | 58.60 (Thinking Enabled, Tools) | 50.70 (Thinking Enabled, Tools) | -- | -- |
| SWE-bench Verified | Coding and software engineering | 80.20 (Thinking Enabled, Tools) | 76.80 (Thinking Enabled, Tools) | 71.30 (Thinking Enabled, Tools) | 51.80 (Standard Mode) |
| BrowseComp | AI Agent - Information gathering | 83.20 (Thinking Enabled, Tools) | 60.60 (Thinking Enabled, Tools) | 60.20 (Thinking Enabled, Tools) | -- |
| Terminal Bench 2.0 | AI Agent - Tool use | 66.70 (Thinking Enabled, Tools) | 50.80 (Thinking Enabled, Tools) | -- | -- |
| AIME 2026 | Mathematical reasoning | 96.40 (Thinking Enabled) | 92.50 (Thinking Enabled) | -- | -- |
| IMO-AnswerBench | Mathematical reasoning | 86.00 (Thinking Enabled) | 81.80 (Thinking Enabled) | -- | -- |
| Claw Bench | OpenClaw comprehensive agent capability evaluation | 80.90 (Thinking Enabled, Tools) | 81.70 (Thinking Enabled, Tools) | 82.50 (Thinking Enabled, Tools) | -- |
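To make the version-to-version change easier to read, the following minimal Python sketch computes the score difference between Kimi K2.6 and Kimi K2.5 for each benchmark in the table above. The helper name `score_deltas` is hypothetical, and the scores are copied from the table as-is. Note that some rows mix modes (for HLE and SWE-bench Multilingual, K2.6 is listed with Tools while K2.5 is not), so those differences are not like-for-like.

```python
# Best scores copied from the version-history table.
KIMI_K2_6 = {
    "GPQA Diamond": 90.50, "HLE": 54.00, "LiveCodeBench": 89.60,
    "SWE-bench Multilingual": 76.70, "SWE-Bench Pro - Public": 58.60,
    "SWE-bench Verified": 80.20, "BrowseComp": 83.20,
    "Terminal Bench 2.0": 66.70, "AIME 2026": 96.40,
    "IMO-AnswerBench": 86.00, "Claw Bench": 80.90,
}
KIMI_K2_5 = {
    "GPQA Diamond": 87.60, "HLE": 30.10, "LiveCodeBench": 85.00,
    "SWE-bench Multilingual": 73.00, "SWE-Bench Pro - Public": 50.70,
    "SWE-bench Verified": 76.80, "BrowseComp": 60.60,
    "Terminal Bench 2.0": 50.80, "AIME 2026": 92.50,
    "IMO-AnswerBench": 81.80, "Claw Bench": 81.70,
}


def score_deltas(new: dict, old: dict) -> dict:
    """Return new minus old for every benchmark present in both versions."""
    return {name: round(new[name] - old[name], 2) for name in new if name in old}


deltas = score_deltas(KIMI_K2_6, KIMI_K2_5)

# Print benchmarks from largest gain to largest drop.
for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {delta:+.2f}")
```

On these figures, Claw Bench is the only benchmark where Kimi K2.6 scores below Kimi K2.5 (80.90 versus 81.70).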
Single-Benchmark Version Trend
Viewing: GPQA Diamond · General evaluation
Standard API Pricing Across the Kimi K2.6 Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Kimi K2.6 | Facebook AI Research Lab | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| Kimi K2.5 | -- | $0.6 / 1M tokens | $3 / 1M tokens | -- |
| Kimi K2 Thinking | -- | $0.6 / 1M tokens | $2.5 / 1M tokens | -- |
| Kimi K2 | -- | $0.6 / 1M tokens | $2.5 / 1M tokens | -- |