
LLM benchmark comparison results

See key specs and per-benchmark scores for each model/mode. This page compares benchmark data and core parameters for 4 models.

Kimi K2 0905 · Kimi K2 · DeepSeek-V3.1 · Qwen3-Coder-480B-A35B

Spec comparison

Moonshot AI

Kimi K2 0905

Kimi K2-Instruct-0905

Release: 2025-09-05

Context length: 256K

Parameters: 1000B

Modes: Non-Thinking Mode

Moonshot AI

Kimi K2

Kimi-K2-0711-Preview

Release: 2025-07-11

Context length: 131K

Parameters: 1000B

Modes: Non-Thinking Mode

DeepSeek-AI

DeepSeek-V3.1

Release: 2025-08-20

Context length: 128K

Parameters: 671B

Modes: Non-Thinking Mode, Thinking Mode

Alibaba

Qwen3-Coder-480B-A35B

Qwen3-Coder-480B-A35B-Instruct

Release: 2025-07-23

Context length: 256K

Parameters: 480B

Modes: Non-Thinking Mode

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.


Best Overall: Kimi K2 0905 · 52.65 (equal to the mean of its four benchmark scores)

Best Single: DeepSeek-V3.1 · AIME2025 88.40


Benchmark scores

Higher is usually better; “—” means no score.

Filter: All Modes · Exclude Parallel. Showing 3 model/mode entries across 4 benchmarks.


Benchmark score table

Complete scores for each model/mode across selected benchmarks.


Supported modes: Normal, Think, Deep, Tool, Parallel

Benchmark	Kimi K2 0905 (Moonshot AI)	DeepSeek-V3.1 (DeepSeek-AI)	Qwen3-Coder-480B-A35B (Alibaba)
General evaluation
HLE	21.70	15.90	—
Coding and software engineering
SWE-bench Verified	69.20	—	67.00
Math reasoning
AIME2025	75.20	88.40	—
AI Agent - tool use
Terminal-Bench	44.50	—	37.50
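
The "Best Overall" and "Best Single" figures can be reproduced from the table above. The page does not state how the overall score is aggregated; the sketch below assumes it is the plain mean over whichever benchmarks a model has a score for, which happens to give 52.65 for Kimi K2 0905. The names `SCORES`, `average_available`, `best_overall`, and `best_single` are illustrative and not part of the site.

```python
from statistics import mean

# Scores copied from the benchmark table; None stands for a missing score.
SCORES = {
    "Kimi K2 0905": {
        "HLE": 21.70, "SWE-bench Verified": 69.20,
        "AIME2025": 75.20, "Terminal-Bench": 44.50,
    },
    "DeepSeek-V3.1": {
        "HLE": 15.90, "SWE-bench Verified": None,
        "AIME2025": 88.40, "Terminal-Bench": None,
    },
    "Qwen3-Coder-480B-A35B": {
        "HLE": None, "SWE-bench Verified": 67.00,
        "AIME2025": None, "Terminal-Bench": 37.50,
    },
}


def average_available(scores):
    """Mean over the benchmarks that have a score (assumed aggregation rule)."""
    present = [v for v in scores.values() if v is not None]
    return round(mean(present), 2)


def best_overall(table):
    """Return (model, average) for the model with the highest average."""
    averages = {model: average_available(s) for model, s in table.items()}
    winner = max(averages, key=averages.get)
    return winner, averages[winner]


def best_single(table):
    """Return (model, benchmark, score) for the highest individual score."""
    entries = (
        (model, bench, value)
        for model, scores in table.items()
        for bench, value in scores.items()
        if value is not None
    )
    return max(entries, key=lambda entry: entry[2])


if __name__ == "__main__":
    print(best_overall(SCORES))
    print(best_single(SCORES))
```

Under this rule each model is averaged over a different subset of benchmarks, so the overall ranking is not a like-for-like comparison; the three averages differ by only half a point.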

Feature compare

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

Features & specs	Kimi K2 0905 (Moonshot AI)	Kimi K2 (Moonshot AI)	DeepSeek-V3.1 (DeepSeek-AI)	Qwen3-Coder-480B-A35B (Alibaba)
Model snapshots
Organization	Moonshot AI	Moonshot AI	DeepSeek-AI	Alibaba
Full model name	Kimi K2-Instruct-0905	Kimi-K2-0711-Preview	DeepSeek-V3.1	Qwen3-Coder-480B-A35B-Instruct
Model description	Not provided	Not provided	Not provided	Not provided
Model type	Chat LLM	Chat LLM	Chat LLM	Coding LLM
Model ID	kimi-k2-0905	kimi-k2-0711-base-preview	deepseek-v-3_1	Qwen3-Coder-480B-A35B-Instruct
Release	2025-09-05	2025-07-11	2025-08-20	2025-07-23
MoE	Yes	Yes	Yes	Yes
Specs and performance
Context length	256K	131K	128K	256K
Parameters (B)	1000	1000	671	480
Activated parameters (B)	32	32	37	35
Model scale	100b	100b	100b	100b
Model size	1.01 TB	1.01 TB	1340 GB	470.77 GB
Inference speed
Inference tier
Max output (tokens)	4096	134144	8192	16384
Supported modes	Non-Thinking Mode	Non-Thinking Mode	Non-Thinking Mode, Thinking Mode	Non-Thinking Mode
Open source and licensing
Code Open Source	Closed Source	Not provided	Closed Source	Not provided
Weights Open Source	Closed Source	Not provided	Closed Source	Not provided
Commercial use	Free for commercial use	Free for commercial use	Free for commercial use	Free for commercial use
Modality support
Text Input/Output	/	/	/	/
Image Input/Output	/	/	/	/
Audio Input/Output	/	/	/	/
Video Input/Output	/	/	/	/
Embedding Input/Output	/	/	/	/
API details
Text pricing	Input: $0.60 / 1M tokens; Output: $2.50 / 1M tokens	Input: $0.60 / 1M tokens; Output: $2.50 / 1M tokens	Input: $0.56 / 1M tokens; Output: $1.68 / 1M tokens	Not provided
Image API pricing	Not provided	Not provided	Not provided	Not provided
Audio API pricing	Not provided	Not provided	Not provided	Not provided
Video API pricing	Not provided	Not provided	Not provided	Not provided
Embedding API pricing	Not provided	Not provided	Not provided	Not provided
Resources
GitHub	Not provided	Repo	Not provided	Repo
Hugging Face	Model Page	Model Page	Model Page	Model Page
Official Page	Not provided	Not provided	Not provided	Not provided
Guides	Not provided	Not provided	Not provided	Not provided
Papers		Kimi K2: Open Agentic Intelligence	DeepSeek-V3.1 Release	Qwen3-Coder: Agentic Coding in the World
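DataLearnerAI	Moonshot AI releases Kimi K2-Instruct-0905: a fully upgraded open agentic model with 256K context length	Kimi open-sources the K2 model: the world's first open-source, commercially usable model at the 1-trillion-parameter scale, with an MoE architecture and benchmark results on par with DeepSeekV3, but the model files take up 1 TB!	DeepSeek V4 has not arrived, but DeepSeekAI has upgraded DeepSeek V3 to DeepSeek V3.1: a minor update with the core architecture and parameters unchanged	Alibaba open-sources the new coding model Qwen3-Coder-480B-A35B, officially claimed to approach Claude Sonnet 4 in coding ability; free, open source, and commercially usable, released alongside Qwen Code, a free open-source alternative to Claude Code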
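
All four models are listed as MoE, so only part of each model is active for a given token. As a rough illustration, the sketch below computes that share from the total and activated parameter counts in the table, taken in billions (1000, 671, 480 total; 32, 37, 35 activated). This assumes the page's raw figures are in units of 100 million parameters, which matches the "480B-A35B" model name and the 1-trillion-parameter description of Kimi K2. `MODELS` and `active_fraction` are hypothetical names.

```python
# (total parameters, activated parameters), both in billions.
# Assumption: the page's raw values are in units of 100 million parameters.
MODELS = {
    "Kimi K2 0905": (1000, 32),
    "Kimi K2": (1000, 32),
    "DeepSeek-V3.1": (671, 37),
    "Qwen3-Coder-480B-A35B": (480, 35),
}


def active_fraction(total_b, active_b):
    """Share of the model's parameters that are activated."""
    return active_b / total_b


if __name__ == "__main__":
    for name, (total, active) in MODELS.items():
        share = active_fraction(total, active)
        print(f"{name}: {active}B of {total}B activated ({share:.1%})")
```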

API pricing

API price comparison

Side-by-side input/output token pricing
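
The text prices listed in the feature table can be compared side by side by pricing the same workload on each model. The sketch below is only an illustration: the workload of 2M input tokens and 500K output tokens is made up, `PRICES` and `request_cost` are hypothetical names, and Qwen3-Coder-480B-A35B is left out because its price is listed as "Not provided".

```python
# USD per 1M tokens as (input, output), copied from the "Text pricing" row.
PRICES = {
    "Kimi K2 0905": (0.60, 2.50),
    "Kimi K2": (0.60, 2.50),
    "DeepSeek-V3.1": (0.56, 1.68),
}


def request_cost(model, input_tokens, output_tokens):
    """Cost in USD for the given token counts at the listed per-1M prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES:
        cost = request_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

For this workload the listed prices give $2.45 for either Kimi K2 variant and $1.96 for DeepSeek-V3.1; output-heavy workloads widen the gap because the output price differs more than the input price.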

