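LMArena Coding Arena Code Capability Leaderboard

A leaderboard of coding ability for the latest AI large language models, based on anonymous user votes in LMArena Coding Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.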

Top model: Opus 4.7 (thinking)
Highest score: 1572.00
Number of models: 200
Data version: April 24, 2026
Data source: LM Arena

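About this leaderboard

This leaderboard ranks current AI large language models on coding tasks. The data comes from the Coding track of LMArena (formerly LMSYS Chatbot Arena), which evaluates models through anonymous blind votes cast by real users.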

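Evaluation method overview

Anonymous blind testing: after a user submits a programming question, two models with hidden identities each return a code answer, and the user votes for the better one, which removes brand bias.

Elo rating: Elo scores are computed with the Bradley-Terry model. The higher the score, the more often users chose that model's code answers.

Coverage of many programming scenarios: code generation, bug fixing, algorithm implementation, code explanation, and other common real-world programming tasks.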
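
To make the Elo rating step concrete, here is a minimal sketch of fitting a Bradley-Terry model to pairwise votes and mapping the fitted strengths onto an Elo-like scale. It is not LMArena's actual pipeline: the function names, the handling of ties (ignored), the 400-point scale, the 1000-point base, and the toy votes are all assumptions made for illustration.

```python
import math
from collections import Counter


def fit_bradley_terry(votes, iters=500):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic minorization-maximization update. Assumes winner and
    loser are always different models and that ties were dropped beforehand.
    """
    models = sorted({m for pair in votes for m in pair})
    wins = Counter(winner for winner, _ in votes)
    games = Counter(frozenset(pair) for pair in votes)  # comparisons per pair
    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            # Floor the win count so a model with zero wins keeps a positive strength.
            updated[i] = max(wins[i], 1e-9) / denom
        # Strengths are only defined up to a constant; fix the geometric mean at 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo(strength, scale=400.0, base=1000.0):
    """Map strengths to an Elo-like scale (scale and base are assumed values)."""
    return {m: base + scale * math.log10(s) for m, s in strength.items()}


# Hypothetical votes among three anonymous models.
votes = (
    [("A", "B")] * 70 + [("B", "A")] * 30
    + [("B", "C")] * 60 + [("C", "B")] * 40
    + [("A", "C")] * 80 + [("C", "A")] * 20
)
ratings = to_elo(fit_bradley_terry(votes))
for model, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.1f}")
```

Only differences between scores are meaningful in this sketch: shifting every rating by the same constant leaves the predicted win rates unchanged.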

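DataLearner adds Chinese-language interpretation and in-depth analysis on top of the raw data, and links each leaderboard model to the DataLearner model library, where you can view model details, API pricing, benchmark scores, and other information.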

Full ranking table

| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | Opus 4.7 (thinking) | 1572.00 | +/-17 | 1,266 | Anthropic | Proprietary |
| 2 | Opus 4.7 | 1560.00 | +/-15 | 1,577 | Anthropic | Proprietary |
| 3 | Claude Opus 4.6 (thinking) | 1554.00 | +/-9 | 4,483 | Anthropic | Proprietary |
| 4 | Claude Opus 4.6 | 1549.00 | +/-9 | 5,165 | Anthropic | Proprietary |
| 5 | Muse Spark | 1533.00 | +/-14 | 1,754 | Facebook AI Research Lab | Proprietary |
| 6 | gpt-5.4-high | 1532.00 | +/-11 | 3,110 | OpenAI | Proprietary |
| 7 | Gemini 3.1 Pro Preview | 1531.00 | +/-8 | 5,932 | Google Deep Mind | Proprietary |
| 8 | Claude Opus 4 (thinking-32k) | 1531.00 | +/-7 | 7,634 | Anthropic | Proprietary |
| 9 | grok-4.20-beta-0309-reasoning | 1520.00 | +/-10 | 3,289 | xAI | Proprietary |
| 10 | GLM 5.1 | 1520.00 | +/-12 | 2,233 | Zhipu AI | MIT |
| 11 | Claude Sonnet 4.6 | 1520.00 | +/-11 | 3,109 | Anthropic | Proprietary |
| 12 | gpt-5.2-chat-latest-20260210 | 1520.00 | +/-9 | 4,632 | OpenAI | Proprietary |
| 13 | grok-4.20-multi-agent-beta-0309 | 1519.00 | +/-10 | 3,455 | xAI | Proprietary |
| 14 | Claude Sonnet 4.5 (thinking-32k) | 1519.00 | +/-6 | 13,691 | Anthropic | Proprietary |
| 15 | Gemini 3.0 Pro (Preview 11-2025) | 1519.00 | +/-7 | 8,584 | Google Deep Mind | Proprietary |
| 16 | Claude Opus 4 | 1519.00 | +/-7 | 11,221 | Anthropic | Proprietary |
| 17 | GPT-5.4 | 1516.00 | +/-10 | 3,320 | OpenAI | Proprietary |
| 18 | kimi-k2.6 | 1515.00 | +/-18 | 1,025 | Moonshot | Modified MIT |
| 19 | dola-seed-2.0-pro | 1514.00 | +/-8 | 5,652 | Bytedance | Proprietary |
| 20 | grok-4.20-beta1 | 1513.00 | +/-10 | 3,275 | xAI | Proprietary |
| 21 | Opus 4.1 (thinking-16k) | 1512.00 | +/-7 | 9,850 | Anthropic | Proprietary |
| 22 | Claude Sonnet 4.5 | 1510.00 | +/-6 | 13,268 | Anthropic | Proprietary |
| 23 | Gemini 3.0 Flash | 1509.00 | +/-8 | 6,397 | Google Deep Mind | Proprietary |
| 24 | Kimi K2 Thinking | 1508.00 | +/-8 | 5,630 | Moonshot AI | Modified MIT |
| 25 | qwen3.5-max-preview | 1506.00 | +/-11 | 2,827 | Alibaba | Proprietary |
| 26 | gpt-5.4-mini-high | 1505.00 | +/-12 | 2,614 | OpenAI | Proprietary |
| 27 | Opus 4.1 | 1505.00 | +/-5 | 15,543 | Anthropic | Proprietary |
| 28 | kimi-k2.5-instant | 1504.00 | +/-14 | 1,805 | Moonshot | Modified MIT |
| 29 | mimo-v2-pro | 1503.00 | +/-11 | 2,923 | Xiaomi | Proprietary |
| 30 | qwen3.6-plus | 1502.00 | +/-15 | 1,392 | Alibaba | Proprietary |
| 31 | Grok 4.1 Thinking | 1501.00 | +/-6 | 10,675 | xAI | Proprietary |
| 32 | gpt-5.3-chat-latest | 1501.00 | +/-9 | 4,282 | OpenAI | Proprietary |
| 33 | Gemini 3.0 Flash (thinking-minimal) | 1499.00 | +/-7 | 8,274 | Google Deep Mind | Proprietary |
| 34 | deepseek-v4-pro-thinking | 1499.00 | +/-19 | 927 | DeepSeek | MIT |
| 35 | gemma-4-31b | 1498.00 | +/-16 | 1,345 | Google | Apache 2.0 |
| 36 | Claude Opus 4 (thinking-16k) | 1498.00 | +/-8 | 6,677 | Anthropic | Proprietary |
| 37 | longcat-flash-chat-2602-exp | 1496.00 | +/-12 | 2,444 | Meituan | Proprietary |
| 38 | GPT-5.2 Pro (high) | 1496.00 | +/-7 | 7,636 | OpenAI | Proprietary |
| 39 | GPT-5.2 | 1494.00 | +/-7 | 6,953 | OpenAI | Proprietary |
| 40 | GLM-5 | 1493.00 | +/-9 | 4,075 | Zhipu AI | MIT |
| 41 | ERNIE 5.0 | 1493.00 | +/-8 | 5,797 | Baidu | Proprietary |
| 42 | Grok 4.1 | 1491.00 | +/-6 | 11,728 | xAI | Proprietary |
| 43 | GPT-5.1 Pro (high) | 1490.00 | +/-7 | 8,223 | OpenAI | Proprietary |
| 44 | amazon-nova-experimental-chat-26-02-10 | 1488.00 | +/-20 | 842 | Amazon | Proprietary |
| 45 | Qwen3.5-397B-A17B | 1487.00 | +/-9 | 4,640 | Alibaba | Apache 2.0 |
| 46 | kimi-k2-thinking-turbo | 1487.00 | +/-6 | 10,891 | Moonshot | Modified MIT |
| 47 | GLM-4.7 | 1486.00 | +/-12 | 2,414 | Zhipu AI | MIT |
| 48 | gemma-4-26b-a4b | 1481.00 | +/-15 | 1,352 | Google | Apache 2.0 |
| 49 | Qwen3 Max (Preview) | 1481.00 | +/-8 | 5,367 | Alibaba | Proprietary |
| 50 | deepseek-v4-pro | 1480.00 | +/-17 | 1,119 | DeepSeek | MIT |
| 51 | amazon-nova-experimental-chat-26-01-10 | 1480.00 | +/-21 | 739 | Amazon | Proprietary |
| 52 | deepseek-v4-flash-thinking | 1479.00 | +/-19 | 843 | DeepSeek | MIT |
| 53 | claude-haiku-4-5-20251001 | 1476.00 | +/-6 | 13,881 | Anthropic | Proprietary |
| 54 | deepseek-v4-flash | 1476.00 | +/-20 | 867 | DeepSeek | MIT |
| 55 | qwen3-max-2025-09-23 | 1475.00 | +/-13 | 2,045 | Alibaba | Proprietary |
| 56 | longcat-flash-chat | 1474.00 | +/-13 | 2,238 | Meituan | MIT |
| 57 | DeepSeek V3.2 (thinking) | 1474.00 | +/-7 | 7,775 | DeepSeek-AI | MIT |
| 58 | GPT-5.1 Instant | 1474.00 | +/-7 | 9,131 | OpenAI | Proprietary |
| 59 | DeepSeek V3.2-Exp (thinking) | 1474.00 | +/-13 | 1,917 | DeepSeek-AI | MIT |
| 60 | Claude Sonnet 4 (thinking-32k) | 1472.00 | +/-8 | 6,418 | Anthropic | Proprietary |
| 61 | Qwen3-235B-A22B-2507 | 1471.00 | +/-5 | 17,554 | Alibaba | Apache 2.0 |
| 62 | ERNIE 5.0 | 1471.00 | +/-13 | 1,966 | Baidu | Proprietary |
| 63 | chatgpt-4o-latest-20250326 | 1469.00 | +/-5 | 15,883 | OpenAI | Proprietary |
| 64 | Mistral Large 3 | 1468.00 | +/-7 | 9,091 | MistralAI | Apache 2.0 |
| 65 | DeepSeek V3.2 | 1468.00 | +/-7 | 9,609 | DeepSeek-AI | MIT |
| 66 | kimi-k2-0905-preview | 1467.00 | +/-13 | 2,246 | Moonshot | Modified MIT |
| 67 | GPT-5-Pro (high) | 1466.00 | +/-8 | 6,364 | OpenAI | Proprietary |
| 68 | MiniMax-M2.7 | 1466.00 | +/-11 | 2,714 | MiniMaxAI | Modified MIT |
| 69 | Gemini 2.5 Pro Experimental 03-25 | 1466.00 | +/-5 | 22,357 | Google Deep Mind | Proprietary |
| 70 | Qwen3-VL-235B-A22B-Instruct | 1465.00 | +/-13 | 2,319 | Alibaba | Apache 2.0 |
| 71 | DeepSeek V3.2-Exp | 1465.00 | +/-12 | 2,499 | DeepSeek-AI | MIT |
| 72 | grok-4-1-fast-reasoning | 1464.00 | +/-6 | 10,141 | xAI | Proprietary |
| 73 | DeepSeek-R1-0528 | 1464.00 | +/-11 | 2,729 | DeepSeek-AI | MIT |
| 74 | Claude Opus 4 | 1463.00 | +/-7 | 7,908 | Anthropic | Proprietary |
| 75 | GPT-5 | 1463.00 | +/-8 | 5,988 | OpenAI | Proprietary |
| 76 | deepseek-v3.1-terminus-thinking | 1461.00 | +/-24 | 637 | DeepSeek | MIT |
| 77 | gemini-3.1-flash-lite-preview | 1461.00 | +/-9 | 4,702 | Google | Proprietary |
| 78 | gpt-5.4-nano-high | 1460.00 | +/-12 | 2,497 | OpenAI | Proprietary |
| 79 | GLM-4.6 | 1460.00 | +/-7 | 7,496 | Zhipu AI | MIT |
| 80 | Kimi K2 | 1459.00 | +/-8 | 5,249 | Moonshot AI | Modified MIT |
| 81 | GPT-4.5 | 1458.00 | +/-13 | 1,939 | OpenAI | Proprietary |
| 82 | Grok 4 Fast | 1458.00 | +/-16 | 1,248 | xAI | Proprietary |
| 83 | OpenAI o3 | 1458.00 | +/-6 | 11,757 | OpenAI | Proprietary |
| 84 | qwen3-coder-480b-a35b-instruct | 1456.00 | +/-9 | 4,858 | Alibaba | Apache 2.0 |
| 85 | DeepSeek-V3.1 (thinking) | 1456.00 | +/-13 | 1,906 | DeepSeek-AI | MIT |
| 86 | gpt-4.1-2025-04-14 | 1456.00 | +/-7 | 9,323 | OpenAI | Proprietary |
| 87 | MiniMax M2.5 | 1456.00 | +/-9 | 4,829 | MiniMaxAI | Modified MIT |
| 88 | qwen3-vl-235b-a22b-thinking | 1455.00 | +/-14 | 1,631 | Alibaba | Apache 2.0 |
| 89 | GLM-4.5 | 1454.00 | +/-9 | 4,771 | Zhipu AI | MIT |
| 90 | Magistral-Medium-2506 | 1453.00 | +/-5 | 17,051 | MistralAI | Proprietary |
| 91 | qwen3.5-122b-a10b | 1452.00 | +/-10 | 3,778 | Alibaba | Apache 2.0 |
| 92 | Claude Sonnet 3.7 (thinking-32k) | 1450.00 | +/-8 | 6,196 | Anthropic | Proprietary |
| 93 | Step 3.5 Flash | 1449.00 | +/-8 | 4,924 | StepFunAI | Apache 2.0 |
| 94 | mimo-v2-flash (non-thinking) | 1449.00 | +/-7 | 7,751 | Xiaomi | MIT |
| 95 | Claude Sonnet 4 | 1448.00 | +/-7 | 7,398 | Anthropic | Proprietary |
| 96 | DeepSeek-V3.1 | 1446.00 | +/-12 | 2,627 | DeepSeek-AI | MIT |
| 97 | qwen3-235b-a22b-no-thinking | 1445.00 | +/-8 | 6,981 | Alibaba | Apache 2.0 |
| 98 | qwen3-next-80b-a3b-instruct | 1445.00 | +/-9 | 4,801 | Alibaba | Apache 2.0 |
| 99 | qwen3.5-27b | 1444.00 | +/-10 | 3,774 | Alibaba | Apache 2.0 |
| 100 | DeepSeek-R1 | 1444.00 | +/-12 | 2,317 | DeepSeek-AI | MIT |
| 101 | Grok 3 | 1443.00 | +/-8 | 5,401 | xAI | Proprietary |
| 102 | qwen3-235b-a22b-thinking-2507 | 1441.00 | +/-15 | 1,612 | Alibaba | Apache 2.0 |
| 103 | trinity-large-preview | 1440.00 | +/-10 | 3,721 | Arcee AI | Apache 2.0 |
| 104 | minimax-m2.1-preview | 1440.00 | +/-10 | 3,431 | MiniMax | MIT |
| 105 | qwen3-30b-a3b-instruct-2507 | 1439.00 | +/-9 | 4,668 | Alibaba | Apache 2.0 |
| 106 | DeepSeek-V3.1 Terminus | 1439.00 | +/-21 | 782 | DeepSeek-AI | MIT |
| 107 | hunyuan-vision-1.5-thinking | 1438.00 | +/-27 | 437 | Tencent | Proprietary |
| 108 | qwen3.5-35b-a3b | 1437.00 | +/-9 | 3,878 | Alibaba | Apache 2.0 |
| 109 | grok-4-fast-reasoning | 1437.00 | +/-9 | 3,958 | xAI | Proprietary |
| 110 | amazon-nova-experimental-chat-12-10 | 1436.00 | +/-21 | 704 | Amazon | Proprietary |
| 111 | grok-4-0709 | 1435.00 | +/-7 | 8,160 | xAI | Proprietary |
| 112 | o3-mini-high | 1434.00 | +/-12 | 2,596 | OpenAI | Proprietary |
| 113 | claude-3-5-sonnet-20241022 | 1433.00 | +/-6 | 14,970 | Anthropic | Proprietary |
| 114 | qwen3-235b-a22b | 1433.00 | +/-9 | 4,340 | Alibaba | Apache 2.0 |
| 115 | ERNIE 5.0 | 1433.00 | +/-19 | 918 | Baidu | Proprietary |
| 116 | mistral-medium-2505 | 1433.00 | +/-8 | 5,901 | Mistral | Proprietary |
| 117 | mimo-v2-flash (thinking) | 1432.00 | +/-12 | 2,442 | Xiaomi | MIT |
| 118 | gpt-4.1-mini-2025-04-14 | 1432.00 | +/-7 | 6,925 | OpenAI | Proprietary |
| 119 | o1-2024-12-17 | 1432.00 | +/-10 | 3,973 | OpenAI | Proprietary |
| 120 | qwen3.5-flash | 1431.00 | +/-9 | 4,132 | Alibaba | Proprietary |
| 121 | o4-mini-2025-04-16 | 1431.00 | +/-7 | 8,722 | OpenAI | Proprietary |
| 122 | mai-1-preview | 1430.00 | +/-11 | 2,780 | Microsoft AI | Proprietary |
| 123 | gpt-5-mini-high | 1429.00 | +/-9 | 5,506 | OpenAI | Proprietary |
| 124 | Claude Sonnet 3.7 | 1429.00 | +/-7 | 7,149 | Anthropic | Proprietary |
| 125 | gemini-2.5-flash-preview-09-2025 | 1428.00 | +/-8 | 6,850 | Google | Proprietary |
| 126 | DeepSeek-V3-0324 | 1428.00 | +/-7 | 8,377 | DeepSeek-AI | MIT |
| 127 | glm-4.5-air | 1426.00 | +/-8 | 6,116 | Z.ai | MIT |
| 128 | glm-4.7-flash | 1424.00 | +/-11 | 2,693 | Z.ai | MIT |
| 129 | Gemini 2.5 Flash | 1424.00 | +/-5 | 21,705 | Google Deep Mind | Proprietary |
| 130 | qwen3-next-80b-a3b-thinking | 1421.00 | +/-11 | 2,680 | Alibaba | Apache 2.0 |
| 131 | amazon-nova-experimental-chat-11-10 | 1420.00 | +/-8 | 5,323 | Amazon | Proprietary |
| 132 | GLM-4.6V | 1420.00 | +/-25 | 535 | Zhipu AI | MIT |
| 133 | o1-preview | 1416.00 | +/-9 | 5,123 | OpenAI | Proprietary |
| 134 | minimax-m1 | 1415.00 | +/-8 | 6,496 | MiniMax | Apache 2.0 |
| 135 | o3-mini | 1415.00 | +/-6 | 9,462 | OpenAI | Proprietary |
| 136 | mistral-small-2506 | 1412.00 | +/-10 | 3,362 | Mistral | Apache 2.0 |
| 137 | ling-flash-2.0 | 1412.00 | +/-15 | 1,528 | Ant Group | MIT |
| 138 | amazon-nova-experimental-chat-10-20 | 1411.00 | +/-12 | 2,294 | Amazon | Proprietary |
| 139 | intellect-3 | 1410.00 | +/-19 | 971 | Prime Intellect | MIT |
| 140 | nvidia-nemotron-3-super-120b-a12b | 1408.00 | +/-14 | 1,716 | Nvidia | NVIDIA Open Model |
| 141 | qwen3-32b | 1407.00 | +/-24 | 513 | Alibaba | Apache 2.0 |
| 142 | step-3 | 1407.00 | +/-17 | 1,235 | StepFun | Apache 2.0 |
| 143 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1405.00 | +/-22 | 659 | Nvidia | Nvidia Open |
| 144 | glm-4.5v | 1404.00 | +/-18 | 993 | Z.ai | MIT |
| 145 | qwen2.5-max | 1402.00 | +/-8 | 5,102 | Alibaba | Proprietary |
| 146 | hunyuan-t1-20250711 | 1400.00 | +/-20 | 806 | Tencent | Proprietary |
| 147 | hunyuan-turbos-20250226 | 1399.00 | +/-31 | 275 | Tencent | Proprietary |
| 148 | mercury-2 | 1398.00 | +/-21 | 767 | Inception AI | Proprietary |
| 149 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | 1397.00 | +/-7 | 9,697 | Google | Proprietary |
| 150 | nova-2-lite | 1397.00 | +/-12 | 2,518 | Amazon | Proprietary |
| 151 | claude-3-5-sonnet-20240620 | 1396.00 | +/-7 | 13,607 | Anthropic | Proprietary |
| 152 | hunyuan-turbos-20250416 | 1394.00 | +/-14 | 1,776 | Tencent | Proprietary |
| 153 | llama-3.1-nemotron-ultra-253b-v1 | 1391.00 | +/-30 | 367 | Nvidia | Nvidia Open Model |
| 154 | GPT OSS 120B | 1390.00 | +/-8 | 6,497 | OpenAI | Apache 2.0 |
| 155 | ring-flash-2.0 | 1390.00 | +/-15 | 1,540 | Ant Group | MIT |
| 156 | grok-3-mini-high | 1390.00 | +/-10 | 3,301 | xAI | Proprietary |
| 157 | command-a-03-2025 | 1389.00 | +/-6 | 10,221 | Cohere | CC-BY-NC-4.0 |
| 158 | amazon-nova-experimental-chat-10-09 | 1388.00 | +/-24 | 553 | Amazon | Proprietary |
| 159 | o1-mini | 1387.00 | +/-7 | 8,478 | OpenAI | Proprietary |
| 160 | deepseek-v3 | 1387.00 | +/-10 | 3,280 | DeepSeek | DeepSeek |
| 161 | qwen3-30b-a3b | 1386.00 | +/-9 | 4,534 | Alibaba | Apache 2.0 |
| 162 | grok-3-mini-beta | 1386.00 | +/-9 | 4,256 | xAI | Proprietary |
| 163 | magistral-medium-2506 | 1385.00 | +/-12 | 2,250 | Mistral | Proprietary |
| 164 | olmo-3.1-32b-instruct | 1385.00 | +/-12 | 2,521 | Ai2 | Apache 2.0 |
| 165 | qwq-32b | 1384.00 | +/-9 | 4,048 | Alibaba | Apache 2.0 |
| 166 | gemini-2.5-flash-lite-preview-06-17-thinking | 1384.00 | +/-8 | 6,013 | Google | Proprietary |
| 167 | claude-3-5-haiku-20241022 | 1383.00 | +/-6 | 11,251 | Anthropic | Proprietary |
| 168 | minimax-m2 | 1383.00 | +/-15 | 1,545 | MiniMax | Apache 2.0 |
| 169 | gpt-5-nano-high | 1381.00 | +/-15 | 1,688 | OpenAI | Proprietary |
| 170 | qwen-plus-0125 | 1379.00 | +/-18 | 893 | Alibaba | Proprietary |
| 171 | llama-3.1-405b-instruct-bf16 | 1374.00 | +/-7 | 6,249 | Meta | Llama 3.1 Community |
| 172 | deepseek-v2.5-1210 | 1374.00 | +/-17 | 1,079 | DeepSeek | DeepSeek |
| 173 | gpt-4.1-nano-2025-04-14 | 1373.00 | +/-19 | 807 | OpenAI | Proprietary |
| 174 | llama-4-maverick-17b-128e-instruct | 1372.00 | +/-7 | 6,996 | Meta | Llama 4 |
| 175 | hunyuan-turbo-0110 | 1371.00 | +/-30 | 299 | Tencent | Proprietary |
| 176 | step-2-16k-exp-202412 | 1371.00 | +/-20 | 737 | StepFun | Proprietary |
| 177 | athene-v2-chat | 1369.00 | +/-9 | 4,019 | NexusFlow | NexusFlow |
| 178 | GPT OSS 20B | 1369.00 | +/-13 | 2,168 | OpenAI | Apache 2.0 |
| 179 | yi-lightning | 1368.00 | +/-10 | 4,316 | 01 AI | Proprietary |
| 180 | gpt-4o-2024-05-13 | 1368.00 | +/-6 | 19,526 | OpenAI | Proprietary |
| 181 | deepseek-v2.5 | 1368.00 | +/-9 | 4,252 | DeepSeek | DeepSeek |
| 182 | mercury | 1367.00 | +/-29 | 395 | Inception AI | Proprietary |
| 183 | llama-3.1-405b-instruct-fp8 | 1367.00 | +/-7 | 9,714 | Meta | Llama 3.1 Community |
| 184 | hunyuan-large-2025-02-10 | 1366.00 | +/-25 | 519 | Tencent | Proprietary |
| 185 | gemini-2.0-flash-001 | 1365.00 | +/-7 | 6,998 | Google | Proprietary |
| 186 | olmo-3-32b-think | 1364.00 | +/-18 | 1,054 | Ai2 | Apache 2.0 |
| 187 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1364.00 | +/-11 | 3,284 | Nvidia | NVIDIA Open Model |
| 188 | llama-3.3-nemotron-49b-super-v1 | 1362.00 | +/-31 | 286 | Nvidia | Nvidia |
| 189 | llama-4-scout-17b-16e-instruct | 1361.00 | +/-9 | 5,258 | Meta | Llama |
| 190 | mistral-small-3.1-24b-instruct-2503 | 1361.00 | +/-8 | 6,141 | Mistral | Apache 2.0 |
| 191 | gpt-4o-2024-08-06 | 1360.00 | +/-8 | 7,318 | OpenAI | Proprietary |
| 192 | gemma-3-27b-it | 1358.00 | +/-7 | 8,080 | Google | Gemma |
| 193 | grok-2-2024-08-13 | 1358.00 | +/-7 | 10,368 | xAI | Proprietary |
| 194 | qwen2.5-plus-1127 | 1356.00 | +/-14 | 1,553 | Alibaba | Proprietary |
| 195 | gemini-1.5-pro-002 | 1356.00 | +/-7 | 9,175 | Google | Proprietary |
| 196 | hunyuan-large-vision | 1355.00 | +/-19 | 963 | Tencent | Proprietary |
| 197 | qwen2.5-72b-instruct | 1355.00 | +/-8 | 6,688 | Alibaba | Qwen |
| 198 | step-1o-turbo-202506 | 1353.00 | +/-15 | 1,505 | StepFun | Proprietary |
| 199 | mistral-large-2407 | 1353.00 | +/-8 | 7,589 | Mistral | Mistral Research |
| 200 | Claude3-Opus | 1352.00 | +/-6 | 33,748 | Anthropic | Proprietary |

The data is for reference only; official sources take precedence.
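
When reading the score and 95% CI columns together, a small score gap between two models may not be meaningful. The sketch below takes three rows from this leaderboard and checks whether their intervals overlap, then converts the score gap into an expected head-to-head win rate. The helper names are hypothetical, the 400-point logistic scale is the conventional Elo assumption rather than something this page states, and interval overlap is only a rough heuristic, not a formal significance test.

```python
def expected_win_rate(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected probability that A is preferred over B under an Elo-style curve."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


def intervals_overlap(score_a: float, ci_a: float, score_b: float, ci_b: float) -> bool:
    """True when [score - ci, score + ci] for A and for B share at least one point."""
    return abs(score_a - score_b) <= ci_a + ci_b


# (score, 95% CI half-width) copied from the leaderboard rows.
rows = {
    "Opus 4.7 (thinking)": (1572.0, 17.0),
    "Opus 4.7": (1560.0, 15.0),
    "Muse Spark": (1533.0, 14.0),
}

top_score, top_ci = rows["Opus 4.7 (thinking)"]
for name in ("Opus 4.7", "Muse Spark"):
    score, ci = rows[name]
    win = expected_win_rate(top_score, score)
    overlap = intervals_overlap(top_score, top_ci, score, ci)
    print(f"top vs {name}: expected win rate {win:.3f}, intervals overlap: {overlap}")
```

Under these assumptions, the 12-point gap between the top two rows corresponds to an expected win rate only slightly above 50%, and their intervals overlap, so the ordering of the first two places should be read with caution.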

Frequently Asked Questions (FAQ)

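What is LMArena Coding Arena?
LMArena Coding Arena is LMArena's anonymous evaluation platform focused on coding ability. Users submit real programming tasks (such as debugging, code generation, and algorithm implementation). The system shows the outputs of different models side by side with the model names hidden, and users vote for the better answer. The votes are then aggregated with the Elo algorithm into a dynamic leaderboard.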
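How does Coding Arena differ from static benchmarks such as SWE-bench and HumanEval?
Static benchmarks such as SWE-bench, HumanEval, and MBPP use fixed test sets and automated scoring. They are highly reproducible but easy to over-optimize for ("leaderboard gaming"). Coding Arena draws on open-ended requests from real users, so its test content is not fixed and it better reflects how models perform in real programming scenarios. The two approaches complement each other.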
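How do Chinese large models perform on coding?
Chinese models such as DeepSeek V3.2 and Qwen3-235B perform strongly in Coding Arena and place in the upper third of the 200-model table (DeepSeek V3.2 (thinking) at 57th and Qwen3-235B-A22B-2507 at 61st in this snapshot). DeepSeek is open-sourced under the MIT license and the Qwen series supports Chinese-language programming scenarios, which makes both useful reference points for developers choosing an open-source code model.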
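How can AI assist with day-to-day programming work?
Common scenarios include code completion and generation (producing an implementation from a comment or function signature), debugging (pasting an error message and asking the AI to locate the problem), code review (checking for security vulnerabilities or performance issues), unit test generation, and cross-language translation (for example, converting Python to TypeScript). Models near the top of the leaderboard usually perform better in these scenarios.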