Kimi K2.6

Reasoning LLM

Release date: 2026-04-20
Knowledge cutoff: 2025-04
Parameters: 1T (10,000 × 10^8)
Context length: 256K
Chinese support: Supported
Kimi K2.6 is a reasoning LLM published by Moonshot AI and released on 2026-04-20. It has 1T (1,000B) total parameters and a 256K-token context length, and is distributed under the Modified MIT license.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces: Supported
Thinking modes: Not supported
Context length: 256K tokens
Max output length: No data
Model type: Reasoning LLM
Release date: 2026-04-20
Model file size: No data
MoE architecture: Yes
Total params / Active params: 1,000B / 32B
Knowledge cutoff: 2025-04
Open source & experience

Code license: Modified MIT
Weights license: Modified MIT (free for commercial use)
GitHub repo: GitHub link unavailable
Hugging Face: Hugging Face link unavailable
Live demo: https://www.kimi.com/
Official resources

Paper: Kimi K2.6: Advancing Open-Source Coding
DataLearnerAI blog: No blog post yet
API details

API speed: No data
Default unit: $/1M tokens. If vendors use other units, follow their published pricing.

Standard pricing
Type | Condition | Input | Output
Text | - | $0.950 / 1M | $4.00 / 1M

Prompt cache pricing
Type | TTL | Write | Read
Text | 1h | $0.950 / 1M | $0.160 / 1M
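
As a rough illustration of how the prices above combine, the sketch below estimates the cost of a single request. The function name and the billing rule (cached prompt tokens billed at the cache-read rate, the rest of the prompt at the standard input rate) are assumptions made for this example, not a description of the vendor's billing logic. Cache writes are listed at the same rate as standard input, so the sketch does not model them separately.

```python
# Prices copied from the pricing table on this page, in USD per 1M tokens.
INPUT_PER_M = 0.95       # standard input
OUTPUT_PER_M = 4.00      # standard output
CACHE_READ_PER_M = 0.16  # prompt cache read (1h TTL)


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: `cached_tokens` is the part of the prompt served from the
    prompt cache and billed at the read rate; the remaining prompt tokens
    are billed at the standard input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return round(total / 1_000_000, 6)


if __name__ == "__main__":
    # A 200K-token prompt with a 4K-token answer, without and with caching.
    print(estimate_cost(200_000, 4_000))
    print(estimate_cost(200_000, 4_000, cached_tokens=150_000))
```

Under these assumptions, a long prompt that is mostly served from cache costs well under half of the uncached request, because the cache-read rate is roughly one sixth of the input rate.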
Benchmark Results

Kimi K2.6's strongest leaderboard placements are AIME 2026 (rank 1 of 13, score 96.40), LiveCodeBench (rank 3 of 110, score 89.60), and HLE with tools and internet (rank 6 of 133, score 54). This page also consolidates core specs, context limits, and API pricing so the model can be evaluated on benchmark results and deployment constraints together.

General evaluation (3 evaluations)
Benchmark | Mode | Score | Rank/total
GPQA Diamond | Thinking Mode | 90.50 | 13 / 167
HLE | Thinking Mode | 34.70 | 45 / 133
HLE | Thinking Mode, Tools, Internet | 54 | 6 / 133

Coding and software engineering (4 evaluations)
Benchmark | Mode | Score | Rank/total
LiveCodeBench | Thinking Mode | 89.60 | 3 / 110
SWE-bench Verified | Thinking Mode, Tools | 80.20 | 8 / 96
SWE-bench Multilingual | Thinking Mode, Tools | 76.70 | 2 / 10
SWE-Bench Pro - Public | Thinking Mode, Tools | 58.60 | 3 / 28

AI Agent - Information gathering (1 evaluation)
Benchmark | Mode | Score | Rank/total
BrowseComp | Thinking Mode, Tools, Internet | 83.20 | 5 / 37

AI Agent - Tool use (3 evaluations)
Benchmark | Mode | Score | Rank/total
OSWorld-Verified | Thinking Mode, Tools | 73.10 | 4 / 13
Terminal Bench 2.0 | Thinking Mode, Tools | 66.70 | 6 / 35
Tool Decathlon | Thinking Mode, Tools | 50 | 1 / 7

Mathematical reasoning (2 evaluations)
Benchmark | Mode | Score | Rank/total
AIME 2026 | Thinking Mode | 96.40 | 1 / 13
IMO-AnswerBench | Thinking Mode | 86 | 2 / 10

OpenClaw agent capability evaluation (1 evaluation)
Benchmark | Mode | Score | Rank/total
Claw Bench | Thinking Mode, Tools | 80.90 | 19 / 28
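
Because each leaderboard has a different number of entries, raw ranks are hard to compare directly. The sketch below converts a few of the rank/total pairs listed above into a "top X%" figure. The helper name `top_share` and the selection of rows are choices made for this illustration only.

```python
# (benchmark, mode, rank, total) copied from the benchmark results above.
RANKS = [
    ("LiveCodeBench", "Thinking", 3, 110),
    ("HLE", "Thinking + tools + internet", 6, 133),
    ("AIME 2026", "Thinking", 1, 13),
    ("SWE-bench Verified", "Thinking + tools", 8, 96),
    ("Claw Bench", "Thinking + tools", 19, 28),
]


def top_share(rank: int, total: int) -> float:
    """Return rank/total as a percentage (smaller means closer to the top)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return round(100 * rank / total, 1)


if __name__ == "__main__":
    # Sort from the strongest relative placement to the weakest.
    for name, mode, rank, total in sorted(RANKS, key=lambda r: r[2] / r[3]):
        print(f"{name:<20} {mode:<28} top {top_share(rank, total):>5}%")
```

Read this way, rank 1 of 13 on AIME 2026 is a top-place finish on a small board, while rank 3 of 110 on LiveCodeBench is a stronger relative placement, and rank 19 of 28 on Claw Bench falls in the lower half of its board.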
Publisher

Moonshot AI

Model Overview

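Model overview and core goals

Kimi K2.6 is a natively multimodal agent model that Moonshot AI (月之暗面) officially released and open-sourced on April 20, 2026. It is the third major iteration of the K2 series and the successor to K2.5 (January 2026)[reference:0][reference:1]. Its development focus has shifted from purely raising model performance to building a system-level architecture capable of task takeover, workflow orchestration, and multi-agent coordination, positioning the model as an operating system (OS) for agents[reference:2]. Compared with K2.5, K2.6 improves coding ability by about 20% and scales Agent Swarm from 100 sub-agents / 1,500 steps to 300 sub-agents / 4,000 steps[reference:3][reference:4].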

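Architecture and technical specifications

Kimi K2.6 retains the Mixture-of-Experts (MoE) architecture of K2.5, with 1 trillion (1T) total parameters and 32 billion (32B) parameters activated per token[reference:5][reference:6]. The model has 61 layers (one of them dense) and 384 experts, of which 8 are activated per token, plus 1 shared expert that is always active[reference:7][reference:8]. Attention uses Multi-head Latent Attention (MLA), the activation function is SwiGLU, the attention hidden dimension is 7168, the MoE expert hidden dimension is 2048, and there are 64 attention heads[reference:9][reference:10]. The context window is 256K tokens (up from 128K in K2.5), and the vocabulary size is 160K[reference:11][reference:12]. For vision, K2.6 carries the MoonViT vision encoder (400M parameters) and natively accepts image and video input[reference:13][reference:14]. vLLM, SGLang, or KTransformers are recommended for deployment, with transformers version ≥4.57.1 and <5.0.0[reference:15]. The composition of the training data has not been officially disclosed.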
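
To make the "8 of 384 experts per token" figure concrete, here is a minimal sketch of top-k expert routing in plain Python. It assumes a generic gating scheme (pick the k largest router logits, then softmax over the selected ones); the actual gating function used by K2.6 is not described on this page, and `route_token` is a hypothetical helper written for this example.

```python
import math
import random

NUM_EXPERTS = 384     # routed experts (from the text)
TOP_K = 8             # routed experts activated per token (from the text)
SHARED_EXPERTS = 1    # always-active shared expert (from the text)


def route_token(router_logits: list[float], k: int = TOP_K) -> dict[int, float]:
    """Pick the k highest-scoring experts and normalise their weights.

    Assumption: plain top-k selection followed by a softmax over the
    selected logits. Real MoE models may use a different gating function.
    """
    top = sorted(range(len(router_logits)), key=lambda i: router_logits[i], reverse=True)[:k]
    peak = max(router_logits[i] for i in top)  # subtract the max for numerical stability
    exps = {i: math.exp(router_logits[i] - peak) for i in top}
    total = sum(exps.values())
    return {i: e / total for i, e in exps.items()}


# Random logits stand in for the output of a learned router.
rng = random.Random(0)
logits = [rng.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
weights = route_token(logits)

if __name__ == "__main__":
    print(len(weights), "routed experts selected")
    share = (TOP_K + SHARED_EXPERTS) / (NUM_EXPERTS + SHARED_EXPERTS)
    print(f"experts touched per token: {share:.1%}")
    print(f"parameters active per token: {32e9 / 1e12:.1%}")
```

The last line restates the spec arithmetic: 32B active parameters out of 1T total is 3.2% of the model per token, which is why per-token compute is far lower than the total parameter count suggests.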

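Core capabilities and supported modalities

Modality support: Kimi K2.6 is a natively multimodal model that understands text, image, and video input, but it cannot generate images or video as output[reference:16][reference:17].

Long-Horizon Coding: The model generalises across languages (Rust, Go, Python, and others) and across domains (front-end development, DevOps, performance optimisation). In official tests, K2.6 spent 12 hours on a Mac continuously optimising an inference pipeline in Zig, made more than 4,000 tool calls, and raised throughput from about 15 tokens/s to 193 tokens/s. In another case it autonomously refactored exchange-core, an 8-year-old open-source financial matching engine, over 13 hours, modifying more than 4,000 lines of code and raising median throughput by 185%[reference:18][reference:19].

Coding-Driven Design: From a single prompt, the model can generate animated front-end interfaces, call image and video generation tools to produce visual assets, and cover basic full-stack features such as login and databases[reference:20].

Agent Swarm: The model scales horizontally to 300 sub-agents working in parallel across 4,000 coordination steps, with K2.6 handling global scheduling and automatically reassigning tasks that fail[reference:21].

Proactive Orchestration: The model can drive autonomous agents that run in the background around the clock, proactively managing schedules, executing code, and operating across platforms. An agent driven by K2.6 has been run autonomously by Moonshot AI's RL infrastructure team for 5 consecutive days, handling monitoring, incident response, and system operations[reference:22].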
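
The Agent Swarm description above (a coordinator that schedules work and reassigns failed tasks) can be sketched as a simple queue-based loop. This is a toy illustration under stated assumptions, not Moonshot AI's implementation: `run_swarm`, the `execute` callback, and the round-robin reassignment policy are all hypothetical, and only the two limits (300 sub-agents, 4,000 steps) come from the text.

```python
from collections import deque

MAX_SUB_AGENTS = 300  # swarm size stated in the text
MAX_STEPS = 4000      # coordination-step budget stated in the text


def run_swarm(tasks, agents, execute, max_steps=MAX_STEPS):
    """Dispatch tasks round-robin and re-queue failures for the next agent.

    `execute(agent, task)` is a hypothetical callback returning True on success.
    Returns (results, steps_used); results maps each finished task to the
    agent that completed it.
    """
    if not agents or len(agents) > MAX_SUB_AGENTS:
        raise ValueError("need between 1 and 300 sub-agents")
    # Each queue entry is (task, index of the agent to try next).
    queue = deque((task, i) for i, task in enumerate(tasks))
    results, steps = {}, 0
    while queue and steps < max_steps:
        task, idx = queue.popleft()
        agent = agents[idx % len(agents)]
        steps += 1
        if execute(agent, task):
            results[task] = agent
        else:
            queue.append((task, idx + 1))  # reassign to the next agent
    return results, steps


# Toy run: agent "a1" always fails, so its task is picked up by "a2".
agents = ["a0", "a1", "a2"]
tasks = ["t0", "t1", "t2", "t3"]
results, steps = run_swarm(tasks, agents, lambda agent, task: agent != "a1")

if __name__ == "__main__":
    print(results, steps)
```

The step budget acts as a hard stop: if every agent keeps failing, the loop ends after `max_steps` attempts instead of retrying forever.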

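Performance and benchmark results

On coding and agent benchmarks, K2.6 scores as follows:

  • SWE-Bench Pro: 58.6, ahead of GPT-5.4 (57.7) and Claude Opus 4.6 (53.4)[reference:23]
  • SWE-Bench Verified: 80.2, on par with Opus 4.6 (80.8) and Gemini 3.1 Pro (80.6)[reference:24]
  • Terminal-Bench 2.0: 66.7, second only to Gemini 3.1 Pro (68.5)[reference:25]
  • DeepSearchQA F1 score: 92.5, more than 13 points ahead of GPT-5.4 (78.6)[reference:26]
  • HLE Full Suite with Tools: 54.0, with all three closed-source competitors scoring lower[reference:27]

On the internal Kimi Code Bench benchmark, K2.6 improves significantly over K2.5[reference:28]. According to CodeBuddy's internal test data, the tool-call success rate reaches 96.60%, and an independent evaluation by factory.ai shows K2.6 improving by about 15% overall over K2.5[reference:29]. In pure mathematics and reasoning, K2.6 still trails closed-source models: it scores 96.4% on AIME 2026 (GPT-5.4 scores 99.2%), and on GPQA-Diamond it trails Gemini 3.1 Pro by about 2 to 4 points[reference:30].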
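
The comparisons quoted in this section reduce to simple score differences. The sketch below recomputes them from the figures given in the text; the dictionary layout and the `margins` helper are choices made for this illustration, and only scores that appear above are included.

```python
# K2.6 scores and competitor scores as quoted in this section.
K2_6 = {
    "SWE-Bench Pro": 58.6,
    "SWE-Bench Verified": 80.2,
    "Terminal-Bench 2.0": 66.7,
    "DeepSearchQA (F1)": 92.5,
    "AIME 2026": 96.4,
}
RIVALS = {
    "SWE-Bench Pro": {"GPT-5.4": 57.7, "Claude Opus 4.6": 53.4},
    "SWE-Bench Verified": {"Claude Opus 4.6": 80.8, "Gemini 3.1 Pro": 80.6},
    "Terminal-Bench 2.0": {"Gemini 3.1 Pro": 68.5},
    "DeepSearchQA (F1)": {"GPT-5.4": 78.6},
    "AIME 2026": {"GPT-5.4": 99.2},
}


def margins() -> dict[str, dict[str, float]]:
    """Return K2.6's score minus each rival's score (positive = K2.6 ahead)."""
    return {
        bench: {rival: round(K2_6[bench] - score, 1) for rival, score in rivals.items()}
        for bench, rivals in RIVALS.items()
    }


if __name__ == "__main__":
    for bench, diffs in margins().items():
        for rival, diff in diffs.items():
            print(f"{bench:<20} vs {rival:<16} {diff:+.1f}")
```

The differences show a split picture: a narrow lead on SWE-Bench Pro (+0.9 over GPT-5.4), a large lead on DeepSearchQA (+13.9), small deficits on SWE-Bench Verified and Terminal-Bench 2.0, and a 2.8-point gap on AIME 2026.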

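Use cases and limitations

Recommended use cases: (1) long-running coding and refactoring in complex software engineering projects; (2) batch jobs that need many agents working in parallel (for example, bulk generation of résumés or web pages); (3) full-stack application development and front-end interface generation; (4) around-the-clock automated operations and system monitoring; (5) multimodal tasks that require visual understanding (for example, UI recognition and code-driven visual creation).

Known limitations: (1) on pure mathematical reasoning tasks such as AIME 2026, a gap of 2 to 4 points remains against closed-source models such as GPT-5.4[reference:31]; (2) on deep reasoning benchmarks such as GPQA-Diamond, the model trails Gemini 3.1 Pro by about 2 to 4 points[reference:32]; (3) on tool-calling tests such as Toolathlon (50.0) and MCPMark (55.9), it scores below GPT-5.4[reference:33]; (4) its visual understanding overall trails GPT-4.5[reference:34].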

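Access and licensing

Kimi K2.6 is available on the Kimi.com website, the latest Kimi app, the Kimi API, and the Kimi Code programming assistant, and is open to all users[reference:35]. The model weights are open-sourced on Hugging Face under the Modified MIT License, which permits free general use but requires companies with more than 100 million monthly active users or more than 20 million US dollars in annual revenue to display "Kimi K2.6" prominently in their user interface[reference:36][reference:37].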
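
The attribution condition summarised above is a simple either-or threshold test. The sketch below encodes it exactly as this page states it; the function name is hypothetical, and the thresholds reflect this page's summary rather than the license text itself, which should be consulted for the precise wording.

```python
# Thresholds as summarised on this page.
MAU_THRESHOLD = 100_000_000          # monthly active users
REVENUE_THRESHOLD_USD = 20_000_000   # annual revenue in USD, per this page's summary


def must_display_attribution(monthly_active_users: int, annual_revenue_usd: float) -> bool:
    """Return True if either threshold is exceeded (strictly greater than)."""
    return (
        monthly_active_users > MAU_THRESHOLD
        or annual_revenue_usd > REVENUE_THRESHOLD_USD
    )


if __name__ == "__main__":
    print(must_display_attribution(5_000_000, 1_200_000))     # small product
    print(must_display_attribution(150_000_000, 1_200_000))   # large user base
```

Either condition alone triggers the requirement, so a product with a small user base but high revenue would still need to show the "Kimi K2.6" label under this reading.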
