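LMArena Math Arena Mathematical Reasoning Leaderboard

A leaderboard of the mathematical reasoning ability of the latest large AI models, based on anonymous user votes in the LMArena Math Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.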

Top model: gpt-5.4-high
Highest score: 1515.00
Number of models: 346
Data version: May 7, 2026
Data source: LM Arena

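About This Leaderboard

This leaderboard ranks current large AI models by their performance on mathematical reasoning tasks. The data comes from the Math track of LMArena, where real users vote in anonymous blind tests on how well each model solves mathematical problems.

Evaluation Method Overview

Anonymous blind testing: after a user submits a math problem, two models with hidden identities each answer it, and the user votes for the better solution, which keeps brand bias out of the vote.

Elo scoring: Elo scores are computed with the Bradley-Terry model; a higher score means users preferred that model more often in head-to-head votes on mathematical tasks.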
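To make the scoring step concrete, the following is a minimal sketch of fitting a Bradley-Terry model to pairwise votes and mapping the result onto an Elo-like scale. It is not LMArena's actual pipeline: the function names `fit_bradley_terry` and `to_elo_scale`, the model labels A, B and C, the toy votes, the 400-point scale and the anchor of 1000 are all assumptions made for illustration, and ties are ignored.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Uses the classic minorization-maximization update. A tiny floor on the
    win count keeps models without any win at a finite (very low) strength.
    """
    wins = defaultdict(float)         # total wins per model
    pair_counts = defaultdict(float)  # battles per unordered pair of models
    for winner, loser in battles:
        if winner == loser:
            continue  # a model cannot battle itself
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0

    models = sorted({m for pair in pair_counts for m in pair})
    if not models:
        return {}

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            denom = 0.0
            for pair, n in pair_counts.items():
                if m in pair:
                    (other,) = pair - {m}
                    denom += n / (strength[m] + strength[other])
            updated[m] = max(wins[m], 1e-9) / denom
        # Strengths are only defined up to a constant factor, so rescale
        # them to a geometric mean of 1 after every sweep.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo_scale(strength, scale=400.0, anchor=1000.0):
    """Map strengths to an Elo-like scale (scale and anchor are assumptions)."""
    return {m: anchor + scale * math.log10(s) for m, s in strength.items()}


# Hypothetical votes: each tuple is (preferred model, other model).
votes = (
    [("A", "B")] * 7 + [("B", "A")] * 3
    + [("B", "C")] * 6 + [("C", "B")] * 4
    + [("A", "C")] * 8 + [("C", "A")] * 2
)
ratings = to_elo_scale(fit_bradley_terry(votes))
for model, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {rating:.1f}")
```

Only rating differences carry meaning in this sketch; shifting the anchor moves every score by the same amount without changing the order.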


Full Ranking Table

| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | gpt-5.4-high | 1515.00 | ±18 | 1,089 | OpenAI | Proprietary |

Data is for reference only; the official source is authoritative.

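Frequently Asked Questions (FAQ)

01. What is LMArena Math Arena?

LMArena Math Arena is LMArena's anonymous evaluation platform focused on mathematical reasoning. Users submit real math problems (algebra, geometry, competition mathematics, and so on), the system shows the solutions of different models side by side with the model names hidden, and users vote for the better answer. The votes are then aggregated with an Elo algorithm into a dynamic leaderboard.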

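02. How does Math Arena differ from static benchmarks such as MATH-500 and AIME?

Static benchmarks such as MATH-500, AIME, and AMC use a fixed problem set with automatic grading. They are highly reproducible but easy to optimize for specifically ("leaderboard gaming"). Math Arena draws on open-ended math questions from real users, so the test content is not fixed and better reflects how a model behaves in actual mathematical use. The two approaches complement each other.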

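03. Do thinking models perform better in the Math Arena?

Overall, models with chain-of-thought (CoT) or extended reasoning tend to rank higher in the Math Arena. The Thinking mode of the Claude Opus series, the high-compute GPT modes, and the DeepSeek thinking variants all appear near the top of the leaderboard, which suggests that longer reasoning time improves answer quality on mathematical problems.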
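As a rough way to read the score gaps between thinking and standard variants, the sketch below converts a rating difference into an implied head-to-head preference rate. It assumes the conventional logistic Elo curve with a 400-point scale, which this page does not confirm for LMArena; the function name `expected_win_probability` is illustrative, and the scores are copied from the ranking table on this page.

```python
def expected_win_probability(rating_a, rating_b, scale=400.0):
    """Probability that A is preferred over B under a logistic Elo curve.

    The 400-point scale is the conventional Elo choice and is an assumption.
    """
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / scale))


# (thinking score, standard score) taken from the ranking table on this page.
PAIRS = {
    "Claude Opus 4.6": (1512.0, 1500.0),
    "Opus 4.7": (1504.0, 1496.0),
    "Claude Sonnet 4.5": (1456.0, 1424.0),  # thinking-32k vs standard
    "DeepSeek V3.2": (1425.0, 1429.0),
}

for name, (thinking, standard) in PAIRS.items():
    p = expected_win_probability(thinking, standard)
    print(f"{name}: gap {thinking - standard:+.0f}, implied preference {p:.1%}")
```

Under this assumption a 12-point gap corresponds to a preference rate of roughly 52%, so a gap of this size should be read together with the 95% CI column rather than as a decisive advantage.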

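04. How do Chinese large models perform in mathematics?

Chinese models such as DeepSeek, the Qwen3 series, and GLM perform strongly in the Math Arena and rank among the global leaders. DeepSeek is released under the MIT license, and series such as Qwen3-235B support Chinese-language mathematical use cases, making them important reference points when choosing an open-source mathematical reasoning model.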

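Coverage of varied mathematical scenarios: the tasks span algebra, geometry, computational reasoning, competition mathematics, and other real-world mathematical problems.

DataLearner adds Chinese-language commentary and in-depth analysis on top of the raw data, and links each leaderboard model to the DataLearner model library, so that model details, API pricing, and benchmark scores can be looked up in one place.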

| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 2 | Claude Opus 4.6 (thinking) | 1512.00 | ±15 | 1,422 | Anthropic | Proprietary |
| 3 | gpt-5.5-high | 1507.00 | ±27 | 405 | OpenAI | Proprietary |
| 4 | Gemini 3.1 Pro Preview | 1506.00 | ±14 | 1,879 | Google Deep Mind | Proprietary |
| 5 | Opus 4.7 (thinking) | 1504.00 | ±25 | 546 | Anthropic | Proprietary |
| 6 | Claude Opus 4.6 | 1500.00 | ±15 | 1,615 | Anthropic | Proprietary |
| 7 | Opus 4.7 | 1496.00 | ±23 | 588 | Anthropic | Proprietary |
| 8 | GPT-5.5 | 1495.00 | ±29 | 391 | OpenAI | Proprietary |
| 9 | ernie-5.1 | 1494.00 | ±29 | 361 | Baidu | Proprietary |
| 10 | Qwen3.6-Max-Preview | 1486.00 | ±32 | 292 | Alibaba | Proprietary |
| 11 | mimo-v2.5-pro | 1480.00 | ±29 | 362 | Xiaomi | MIT |
| 12 | deepseek-v4-pro-thinking | 1479.00 | ±35 | 268 | DeepSeek | MIT |
| 13 | Gemini 3.0 Pro (Preview 11-2025) | 1478.00 | ±12 | 2,652 | Google Deep Mind | Proprietary |
| 14 | qwen3.5-max-preview | 1477.00 | ±19 | 943 | Alibaba | Proprietary |
| 15 | Gemini 3.0 Flash | 1476.00 | ±13 | 2,010 | Google Deep Mind | Proprietary |
| 16 | Kimi K2.6 | 1475.00 | ±26 | 421 | Moonshot AI | Modified MIT |
| 17 | Kimi K2 Thinking | 1475.00 | ±14 | 1,767 | Moonshot AI | Modified MIT |
| 18 | GLM 5.1 | 1474.00 | ±21 | 739 | Zhipu AI | MIT |
| 19 | Qwen 3.6 Plus Preview | 1474.00 | ±23 | 575 | Alibaba | Proprietary |
| 20 | Claude Opus 4 (thinking-32k) | 1470.00 | ±12 | 2,274 | Anthropic | Proprietary |
| 21 | grok-4.20-beta-0309-reasoning | 1470.00 | ±17 | 1,118 | xAI | Proprietary |
| 22 | Gemma 4 31B | 1468.00 | ±28 | 398 | DeepMind | Apache 2.0 |
| 23 | Gemma 4 26B A4B | 1467.00 | ±28 | 368 | DeepMind | Apache 2.0 |
| 24 | Claude Opus 4 | 1465.00 | ±10 | 3,468 | Anthropic | Proprietary |
| 25 | grok-4.20-multi-agent-beta-0309 | 1465.00 | ±17 | 1,144 | xAI | Proprietary |
| 26 | Muse Spark | 1464.00 | ±21 | 691 | Facebook AI Research Lab | Proprietary |
| 27 | Claude Sonnet 4.6 | 1463.00 | ±17 | 1,159 | Anthropic | Proprietary |
| 28 | gpt-5.2-chat-latest-20260210 | 1462.00 | ±15 | 1,518 | OpenAI | Proprietary |
| 29 | grok-4.20-beta1 | 1459.00 | ±16 | 1,192 | xAI | Proprietary |
| 30 | Gemini 3.0 Flash (minimal) | 1458.00 | ±12 | 2,552 | Google Deep Mind | Proprietary |
| 31 | GPT-5.2 Pro (high) | 1456.00 | ±12 | 2,430 | OpenAI | Proprietary |
| 32 | Claude Sonnet 4.5 (thinking-32k) | 1456.00 | ±9 | 4,076 | Anthropic | Proprietary |
| 33 | GPT-5.4 | 1455.00 | ±17 | 1,150 | OpenAI | Proprietary |
| 34 | mimo-v2-pro | 1455.00 | ±17 | 1,089 | Xiaomi | Proprietary |
| 35 | GPT-5.1 Pro (high) | 1454.00 | ±12 | 2,500 | OpenAI | Proprietary |
| 36 | dola-seed-2.0-pro | 1449.00 | ±14 | 1,651 | Bytedance | Proprietary |
| 37 | OpenAI o3 | 1448.00 | ±10 | 3,730 | OpenAI | Proprietary |
| 38 | Qwen3.5-397B-A17B | 1446.00 | ±15 | 1,514 | Alibaba | Apache 2.0 |
| 39 | DeepSeek-V4-Pro | 1444.00 | ±32 | 302 | DeepSeek-AI | MIT |
| 40 | GLM-5 | 1444.00 | ±16 | 1,256 | Zhipu AI | MIT |
| 41 | Gemini 2.5 Pro Experimental 03-25 | 1443.00 | ±7 | 7,096 | Google Deep Mind | Proprietary |
| 42 | deepseek-v4-flash-thinking | 1443.00 | ±36 | 255 | DeepSeek | MIT |
| 43 | Opus 4.1 (thinking-16k) | 1443.00 | ±11 | 3,026 | Anthropic | Proprietary |
| 44 | ERNIE 5.0 | 1443.00 | ±14 | 1,796 | Baidu | Proprietary |
| 45 | kimi-k2.5-instant | 1443.00 | ±25 | 514 | Moonshot | Modified MIT |
| 46 | Grok 4.1 Thinking | 1442.00 | ±10 | 3,277 | xAI | Proprietary |
| 47 | Grok 4.3 Beta | 1440.00 | ±34 | 272 | xAI | Proprietary |
| 48 | Qwen3 Max (Preview) | 1439.00 | ±15 | 1,525 | Alibaba | Proprietary |
| 49 | gemini-3.1-flash-lite-preview | 1439.00 | ±15 | 1,532 | Google | Proprietary |
| 50 | Kimi K2 Thinking (thinking-turbo) | 1438.00 | ±10 | 3,298 | Moonshot AI | Modified MIT |
| 51 | longcat-flash-chat-2602-exp | 1436.00 | ±19 | 883 | Meituan | Proprietary |
| 52 | gpt-5.4-nano-high | 1436.00 | ±19 | 893 | OpenAI | Proprietary |
| 53 | gpt-5.4-mini-high | 1436.00 | ±19 | 932 | OpenAI | Proprietary |
| 54 | GPT-5-Pro (high) | 1434.00 | ±14 | 1,890 | OpenAI | Proprietary |
| 55 | hunyuan-hy3-preview | 1434.00 | ±30 | 284 | Tencent | tencent-hunyuan-community |
| 56 | GPT-5.2 | 1433.00 | ±13 | 2,171 | OpenAI | Proprietary |
| 57 | Opus 4.1 | 1433.00 | ±9 | 4,732 | Anthropic | Proprietary |
| 58 | DeepSeek-V4-Flash | 1432.00 | ±36 | 228 | DeepSeek-AI | MIT |
| 59 | Qwen3.5-27B | 1431.00 | ±16 | 1,217 | Alibaba | Apache 2.0 |
| 60 | Grok 4.1 | 1430.00 | ±10 | 3,814 | xAI | Proprietary |
| 61 | qwen3-max-2025-09-23 | 1429.00 | ±24 | 586 | Alibaba | Proprietary |
| 62 | amazon-nova-experimental-chat-26-02-10 | 1429.00 | ±39 | 207 | Amazon | Proprietary |
| 63 | DeepSeek V3.2 | 1429.00 | ±11 | 2,891 | DeepSeek-AI | MIT |
| 64 | GLM-4.7 | 1428.00 | ±21 | 712 | Zhipu AI | MIT |
| 65 | DeepSeek V3.2-Exp (thinking) | 1428.00 | ±26 | 483 | DeepSeek-AI | MIT |
| 66 | Grok 4 | 1428.00 | ±12 | 2,264 | xAI | Proprietary |
| 67 | gpt-5.3-chat-latest | 1427.00 | ±15 | 1,459 | OpenAI | Proprietary |
| 68 | DeepSeek V3.2 (thinking) | 1425.00 | ±12 | 2,396 | DeepSeek-AI | MIT |
| 69 | Grok 4 Fast | 1424.00 | ±29 | 399 | xAI | Proprietary |
| 70 | Qwen3.5-122B-A10B | 1424.00 | ±16 | 1,305 | Alibaba | Apache 2.0 |
| 71 | Claude Sonnet 4.5 | 1424.00 | ±9 | 4,095 | Anthropic | Proprietary |
| 72 | GPT-5.1 Instant | 1423.00 | ±11 | 2,873 | OpenAI | Proprietary |
| 73 | grok-4-1-fast-reasoning | 1422.00 | ±11 | 3,146 | xAI | Proprietary |
| 74 | GLM-4.6 | 1421.00 | ±13 | 2,111 | Zhipu AI | MIT |
| 75 | Qwen3-Next | 1420.00 | ±17 | 1,211 | Alibaba | Apache 2.0 |
| 76 | Qwen3-235B-A22B-2507 | 1419.00 | ±8 | 5,446 | Alibaba | Apache 2.0 |
| 77 | Claude Opus 4 (thinking-16k) | 1419.00 | ±12 | 2,244 | Anthropic | Proprietary |
| 78 | longcat-flash-chat | 1418.00 | ±22 | 688 | Meituan | MIT |
| 79 | DeepSeek V3.2-Exp | 1417.00 | ±21 | 773 | DeepSeek-AI | MIT |
| 80 | Kimi K2 0905 | 1415.00 | ±21 | 760 | Moonshot AI | Modified MIT |
| 81 | OpenAI o4 - mini | 1415.00 | ±11 | 2,940 | OpenAI | Proprietary |
| 82 | DeepSeek-V3.1 | 1415.00 | ±18 | 992 | DeepSeek-AI | MIT |
| 83 | GLM-4.5 | 1414.00 | ±15 | 1,427 | Zhipu AI | MIT |
| 84 | mimo-v2.5 | 1414.00 | ±27 | 376 | Xiaomi | MIT |
| 85 | DeepSeek-V3.1 (thinking) | 1414.00 | ±22 | 665 | DeepSeek-AI | MIT |
| 86 | GPT-5 | 1413.00 | ±14 | 1,786 | OpenAI | Proprietary |
| 87 | Gemini 2.5 Flash-Preview-09-2025 | 1413.00 | ±13 | 1,943 | Google Deep Mind | Proprietary |
| 88 | Qwen3-VL-235B-A22B-Instruct | 1412.00 | ±23 | 702 | Alibaba | Apache 2.0 |
| 89 | grok-4-fast-reasoning | 1412.00 | ±18 | 1,084 | xAI | Proprietary |
| 90 | DeepSeek-R1 | 1411.00 | ±14 | 1,606 | DeepSeek-AI | MIT |
| 91 | deepseek-v3.1-terminus-thinking | 1410.00 | ±40 | 201 | DeepSeek | MIT |
| 92 | GPT-4.5 | 1409.00 | ±15 | 1,393 | OpenAI | Proprietary |
| 93 | amazon-nova-experimental-chat-26-01-10 | 1409.00 | ±33 | 263 | Amazon | Proprietary |
| 94 | o1-2024-12-17 | 1409.00 | ±11 | 2,986 | OpenAI | Proprietary |
| 95 | Step 3.5 Flash | 1408.00 | ±16 | 1,304 | StepFunAI | Proprietary |
| 96 | ERNIE 5.0 | 1408.00 | ±23 | 621 | Baidu | Proprietary |
| 97 | Gemini 2.5 Flash | 1407.00 | ±7 | 7,374 | Google Deep Mind | Proprietary |
| 98 | MiniMax-M2.7 | 1407.00 | ±19 | 845 | MiniMaxAI | Modified MIT |
| 99 | o3-mini-high | 1406.00 | ±13 | 1,909 | OpenAI | Proprietary |
| 100 | Qwen3-VL-235B-A22B-Instruct (thinking) | 1405.00 | ±28 | 428 | Alibaba | Apache 2.0 |
| 101 | gpt-5-mini-high | 1405.00 | ±15 | 1,456 | OpenAI | Proprietary |
| 102 | MiniMax M2.5 | 1405.00 | ±14 | 1,633 | MiniMaxAI | Modified MIT |
| 103 | chatgpt-4o-latest-20250326 | 1404.00 | ±8 | 5,722 | OpenAI | Proprietary |
| 104 | Claude Sonnet 4 (thinking-32k) | 1403.00 | ±13 | 2,025 | Anthropic | Proprietary |
| 105 | Claude Opus 4 | 1403.00 | ±11 | 2,772 | Anthropic | Proprietary |
| 106 | Hunyuan-T1 | 1402.00 | ±38 | 236 | Tencent AI Lab | Proprietary |
| 107 | Mistral Large 3 | 1401.00 | ±11 | 2,683 | MistralAI | Apache 2.0 |
| 108 | Step 3.5 Flash | 1400.00 | ±14 | 1,651 | StepFunAI | Apache 2.0 |
| 109 | ERNIE 5.0 | 1400.00 | ±34 | 268 | Baidu | Proprietary |
| 110 | amazon-nova-experimental-chat-12-10 | 1400.00 | ±37 | 234 | Amazon | Proprietary |
| 111 | Qwen3.5-35B-A3B | 1400.00 | ±16 | 1,260 | Alibaba | Apache 2.0 |
| 112 | Qwen3-32B | 1399.00 | ±30 | 316 | Alibaba | Apache 2.0 |
| 113 | Magistral-Medium-2506 | 1398.00 | ±8 | 5,307 | MistralAI | Proprietary |
| 114 | qwen3-235b-a22b-thinking-2507 | 1398.00 | ±24 | 492 | Alibaba | Apache 2.0 |
| 115 | amazon-nova-experimental-chat-11-10 | 1398.00 | ±15 | 1,589 | Amazon | Proprietary |
| 116 | DeepSeek-R1-0528 | 1397.00 | ±20 | 869 | DeepSeek-AI | MIT |
| 117 | amazon-nova-experimental-chat-10-20 | 1396.00 | ±20 | 805 | Amazon | Proprietary |
| 118 | DeepSeek-V3.1 Terminus | 1396.00 | ±39 | 219 | DeepSeek-AI | MIT |
| 119 | trinity-large-thinking | 1395.00 | ±21 | 748 | Arcee AI | Apache 2.0 |
| 120 | qwen3-235b-a22b-no-thinking | 1395.00 | ±12 | 2,395 | Alibaba | Apache 2.0 |
| 121 | Qwen3-235B-A22B | 1393.00 | ±14 | 1,604 | Alibaba | Apache 2.0 |
| 122 | Haiku 4.5 | 1393.00 | ±9 | 4,133 | Anthropic | Proprietary |
| 123 | M2.1 | 1393.00 | ±18 | 1,017 | MiniMaxAI | MIT |
| 124 | mai-1-preview | 1392.00 | ±19 | 891 | Microsoft AI | Proprietary |
| 125 | GLM-4.5-Air | 1391.00 | ±15 | 1,540 | Zhipu AI | MIT |
| 126 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1390.00 | ±39 | 194 | Nvidia | Nvidia Open |
| 127 | Qwen3-Next (thinking) | 1390.00 | ±20 | 829 | Alibaba | Apache 2.0 |
| 128 | OpenAI o3-mini (high) | 1389.00 | ±18 | 976 | OpenAI | Proprietary |
| 129 | Claude Sonnet 4 | 1388.00 | ±12 | 2,478 | Anthropic | Proprietary |
| 130 | Kimi K2 | 1388.00 | ±14 | 1,696 | Moonshot AI | Modified MIT |
| 131 | OpenAI o1 | 1385.00 | ±10 | 4,569 | OpenAI | Proprietary |
| 132 | intellect-3 | 1384.00 | ±31 | 332 | Prime Intellect | MIT |
| 133 | Claude Sonnet 3.7 (thinking-32k) | 1384.00 | ±11 | 2,794 | Anthropic | Proprietary |
| 134 | GPT OSS 120B | 1383.00 | ±14 | 1,794 | OpenAI | Apache 2.0 |
| 135 | o3-mini | 1382.00 | ±8 | 4,723 | OpenAI | Proprietary |
| 136 | Qwen3-30B-A3B-2507 | 1381.00 | ±15 | 1,429 | Alibaba | Apache 2.0 |
| 137 | llama-3.1-nemotron-ultra-253b-v1 | 1379.00 | ±37 | 209 | Nvidia | Nvidia Open Model |
| 138 | mimo-v2-flash (non-thinking) | 1378.00 | ±12 | 2,324 | Xiaomi | MIT |
| 139 | nvidia-nemotron-3-super-120b-a12b | 1378.00 | ±25 | 510 | Nvidia | NVIDIA Open Model |
| 140 | Qwen3-Coder-480B-A35B | 1377.00 | ±15 | 1,629 | Alibaba | Apache 2.0 |
| 141 | Grok 3 | 1375.00 | ±11 | 2,677 | xAI | Proprietary |
| 142 | mimo-v2-flash (thinking) | 1375.00 | ±22 | 635 | Xiaomi | MIT |
| 143 | GPT-4.1 | 1373.00 | ±10 | 3,232 | OpenAI | Proprietary |
| 144 | minimax-m1 | 1371.00 | ±13 | 1,802 | MiniMax | Apache 2.0 |
| 145 | DeepSeek-V3-0324 | 1370.00 | ±10 | 3,193 | DeepSeek-AI | MIT |
| 146 | grok-3-mini-beta | 1370.00 | ±14 | 1,531 | xAI | Proprietary |
| 147 | GLM-4.7-Flash | 1366.00 | ±21 | 716 | Zhipu AI | MIT |
| 148 | Gemini 2.5 Flash-Lite (thinking) | 1365.00 | ±12 | 2,099 | Google Deep Mind | Proprietary |
| 149 | Gemini 2.5 Flash-Lite-Preview-09-2025 (no-thinking) | 1365.00 | ±11 | 2,884 | Google Deep Mind | Proprietary |
| 150 | Qwen2.5-Max | 1364.00 | ±10 | 3,306 | Alibaba | Proprietary |
| 151 | Step3 | 1364.00 | ±31 | 353 | StepFunAI | Apache 2.0 |
| 152 | QwQ-32B | 1364.00 | ±14 | 1,720 | Alibaba | Apache 2.0 |
| 153 | OpenAI o1-mini | 1362.00 | ±8 | 7,499 | OpenAI | Proprietary |
| 154 | Claude Sonnet 3.7 | 1362.00 | ±10 | 3,359 | Anthropic | Proprietary |
| 155 | MiniMax M2 | 1358.00 | ±33 | 317 | MiniMaxAI | Apache 2.0 |
| 156 | GLM-4.5V | 1357.00 | ±34 | 277 | Zhipu AI | MIT |
| 157 | Gemini 2.0 Flash Experimental | 1356.00 | ±9 | 4,067 | DeepMind | Proprietary |
| 158 | ling-flash-2.0 | 1355.00 | ±27 | 462 | Ant Group | MIT |
| 159 | trinity-large-preview | 1355.00 | ±16 | 1,380 | Arcee AI | Apache 2.0 |
| 160 | GPT-4.1 mini | 1354.00 | ±11 | 2,694 | OpenAI | Proprietary |
| 161 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1353.00 | ±19 | 993 | Nvidia | NVIDIA Open Model |
| 162 | Qwen3-30B-A3B | 1353.00 | ±14 | 1,709 | Alibaba | Apache 2.0 |
| 163 | Claude 3.5 Sonnet | 1349.00 | ±7 | 10,021 | Anthropic | Proprietary |
| 164 | mistral-medium-2505 | 1349.00 | ±12 | 2,231 | Mistral | Proprietary |
| 165 | hunyuan-turbos-20250416 | 1349.00 | ±20 | 845 | Tencent | Proprietary |
| 166 | gpt-5-nano-high | 1345.00 | ±27 | 494 | OpenAI | Proprietary |
| 167 | Claude 3.5 Sonnet | 1341.00 | ±7 | 11,359 | Anthropic | Proprietary |
| 168 | ring-flash-2.0 | 1340.00 | ±27 | 454 | Ant Group | MIT |
| 169 | Mistral-Small-3.2 | 1339.00 | ±18 | 1,042 | MistralAI | Apache 2.0 |
| 170 | Gemini 1.5 Pro | 1338.00 | ±7 | 7,610 | Google Deep Mind | Proprietary |
| 171 | GPT OSS 20B | 1336.00 | ±22 | 680 | OpenAI | Apache 2.0 |
| 172 | Nova 2 Lite | 1334.00 | ±20 | 833 | Amazon | Proprietary |
| 173 | Gemini 2.0 Flash-Lite | 1325.00 | ±10 | 2,814 | DeepMind | Proprietary |
| 174 | qwen-plus-0125 | 1324.00 | ±19 | 732 | Alibaba | Proprietary |
| 175 | Gemma 3 - 27B (IT) | 1322.00 | ±9 | 3,579 | Google Deep Mind | Gemma |
| 176 | llama-4-maverick-17b-128e-instruct | 1319.00 | ±11 | 2,840 | Meta | Llama 4 |
| 177 | llama-3.1-405b-instruct-fp8 | 1318.00 | ±7 | 8,482 | Meta | Llama 3.1 Community |
| 178 | Gemma 3 - 12B (IT) | 1317.00 | ±27 | 389 | Google Deep Mind | Gemma |
| 179 | llama-3.1-405b-instruct-bf16 | 1315.00 | ±8 | 5,215 | Meta | Llama 3.1 Community |
| 180 | step-2-16k-exp-202412 | 1313.00 | ±20 | 642 | StepFun | Proprietary |
| 181 | athene-v2-chat | 1312.00 | ±9 | 3,412 | NexusFlow | NexusFlow |
| 182 | Claude3-Opus | 1311.00 | ±6 | 25,769 | Anthropic | Proprietary |
| 183 | DeepSeek-V3 | 1311.00 | ±11 | 2,721 | DeepSeek-AI | DeepSeek |
| 184 | olmo-3-32b-think | 1311.00 | ±32 | 314 | Ai2 | Apache 2.0 |
| 185 | C4AI Command A (202503) | 1309.00 | ±9 | 3,995 | CohereAI | CC-BY-NC-4.0 |
| 186 | llama-4-scout-17b-16e-instruct | 1309.00 | ±13 | 1,944 | Meta | Llama |
| 187 | GPT-4o | 1308.00 | ±8 | 6,826 | OpenAI | Proprietary |
| 188 | olmo-3.1-32b-instruct | 1306.00 | ±23 | 695 | Ai2 | Apache 2.0 |
| 189 | yi-lightning | 1306.00 | ±10 | 3,921 | 01 AI | Proprietary |
| 190 | qwen2.5-plus-1127 | 1305.00 | ±14 | 1,404 | Alibaba | Proprietary |
| 191 | GPT-4o | 1305.00 | ±7 | 15,103 | OpenAI | Proprietary |
| 192 | gemini-advanced-0514 | 1304.00 | ±9 | 6,395 | Google | Proprietary |
| 193 | gpt-4-1106-preview | 1303.00 | ±8 | 13,306 | OpenAI | Proprietary |
| 194 | hunyuan-turbos-20250226 | 1301.00 | ±31 | 238 | Tencent | Proprietary |
| 195 | step-1o-turbo-202506 | 1300.00 | ±24 | 565 | StepFun | Proprietary |
| 196 | gpt-4-0125-preview | 1299.00 | ±8 | 12,374 | OpenAI | Proprietary |
| 197 | glm-4-plus-0111 | 1297.00 | ±19 | 721 | Zhipu | Proprietary |
| 198 | qwen2.5-72b-instruct | 1296.00 | ±8 | 5,415 | Alibaba | Qwen |
| 199 | Gemini 1.5 Pro | 1296.00 | ±8 | 10,492 | Google Deep Mind | Proprietary |
| 200 | gpt-4-turbo-2024-04-09 | 1296.00 | ±7 | 13,217 | OpenAI | Proprietary |
| 201 | llama-3.3-70b-instruct | 1296.00 | ±8 | 5,778 | Meta | Llama-3.3 |
| 202 | olmo-3.1-32b-think | 1295.00 | ±26 | 473 | Ai2 | Apache 2.0 |
| 203 | Grok 2 | 1293.00 | ±7 | 8,950 | xAI | Proprietary |
| 204 | hunyuan-large-2025-02-10 | 1293.00 | ±24 | 497 | Tencent | Proprietary |
| 205 | deepseek-v2.5-1210 | 1293.00 | ±17 | 1,031 | DeepSeek | DeepSeek |
| 206 | qwen-max-0919 | 1291.00 | ±12 | 2,249 | Alibaba | Qwen |
| 207 | hunyuan-standard-2025-02-10 | 1290.00 | ±24 | 499 | Tencent | Proprietary |
| 208 | gemini-1.5-flash-002 | 1288.00 | ±9 | 4,789 | Google | Proprietary |
| 209 | mistral-large-2407 | 1288.00 | ±8 | 6,664 | Mistral | Mistral Research |
| 210 | deepseek-v2.5 | 1287.00 | ±10 | 3,649 | DeepSeek | DeepSeek |
| 211 | glm-4-plus | 1287.00 | ±10 | 3,599 | Zhipu AI | Proprietary |
| 212 | magistral-medium-2506 | 1285.00 | ±26 | 554 | Mistral | Proprietary |
| 213 | Claude 3.5 Haiku | 1285.00 | ±7 | 6,367 | Anthropic | Proprietary |
| 214 | gpt-4-0314 | 1282.00 | ±10 | 7,052 | OpenAI | Proprietary |
| 215 | mistral-large-2411 | 1282.00 | ±9 | 3,574 | Mistral | MRL |
| 216 | hunyuan-large-vision | 1280.00 | ±30 | 351 | Tencent | Proprietary |
| 217 | ibm-granite-h-small | 1280.00 | ±32 | 358 | IBM | Apache 2.0 |
| 218 | hunyuan-turbo-0110 | 1279.00 | ±31 | 243 | Tencent | Proprietary |
| 219 | llama-3.1-nemotron-70b-instruct | 1278.00 | ±17 | 1,041 | Nvidia | Llama 3.1 |
| 220 | mistral-small-3.1-24b-instruct-2503 | 1277.00 | ±13 | 2,131 | Mistral | Apache 2.0 |
| 221 | gpt-4o-mini-2024-07-18 | 1276.00 | ±7 | 9,322 | OpenAI | Proprietary |
| 222 | gpt-4.1-nano-2025-04-14 | 1274.00 | ±23 | 582 | OpenAI | Proprietary |
| 223 | gpt-4-0613 | 1274.00 | ±8 | 11,181 | OpenAI | Proprietary |
| 224 | qwen2-72b-instruct | 1273.00 | ±9 | 4,835 | Alibaba | Qianwen LICENSE |
| 225 | grok-2-mini-2024-08-13 | 1272.00 | ±8 | 7,261 | xAI | Proprietary |
| 226 | deepseek-coder-v2 | 1270.00 | ±13 | 1,858 | DeepSeek | DeepSeek License |
| 227 | llama-3.1-nemotron-51b-instruct | 1270.00 | ±22 | 507 | Nvidia | Llama 3.1 |
| 228 | qwen2.5-coder-32b-instruct | 1270.00 | ±19 | 725 | Alibaba | Apache 2.0 |
| 229 | amazon-nova-pro-v1.0 | 1269.00 | ±10 | 2,978 | Amazon | Proprietary |
| 230 | llama-3.1-70b-instruct | 1269.00 | ±8 | 7,677 | Meta | Llama 3.1 Community |
| 231 | Phi 4 - 14B | 1265.00 | ±10 | 2,764 | Microsoft Azure | MIT |
| 232 | llama-3.1-tulu-3-70b | 1263.00 | ±25 | 397 | Ai2 | Llama 3.1 |
| 233 | mistral-small-24b-instruct-2501 | 1261.00 | ±13 | 1,683 | Mistral | Apache 2.0 |
| 234 | athene-70b-0725 | 1260.00 | ±10 | 2,921 | NexusFlow | CC-BY-NC-4.0 |
| 235 | gemma-3n-e4b-it | 1260.00 | ±15 | 1,572 | Google | Gemma |
| 236 | llama-3-70b-instruct | 1257.00 | ±7 | 20,941 | Meta | Llama 3 Community |
| 237 | gemini-1.5-flash-001 | 1256.00 | ±8 | 8,392 | Google | Proprietary |
| 238 | gemma-3-4b-it | 1253.00 | ±28 | 423 | Google | Gemma |
| 239 | claude-3-sonnet-20240229 | 1253.00 | ±8 | 13,766 | Anthropic | Proprietary |
| 240 | nemotron-4-340b-instruct | 1252.00 | ±12 | 2,352 | Nvidia | NVIDIA Open Model |
| 241 | hunyuan-standard-256k | 1250.00 | ±29 | 361 | Tencent | Proprietary |
| 242 | glm-4-0520 | 1246.00 | ±16 | 1,191 | Zhipu AI | Proprietary |
| 243 | reka-core-20240904 | 1245.00 | ±14 | 1,207 | Reka AI | Proprietary |
| 244 | amazon-nova-lite-v1.0 | 1244.00 | ±11 | 2,511 | Amazon | Proprietary |
| 245 | jamba-1.5-large | 1244.00 | ±15 | 1,147 | AI21 Labs | Jamba Open |
| 246 | gemma-2-27b-it | 1244.00 | ±7 | 10,170 | Google | Gemma license |
| 247 | mistral-large-2402 | 1244.00 | ±9 | 7,987 | Mistral | Proprietary |
| 248 | c4ai-aya-expanse-32b | 1231.00 | ±10 | 3,854 | Cohere | CC-BY-NC-4.0 |
| 249 | reka-flash-20240904 | 1231.00 | ±14 | 1,284 | Reka AI | Proprietary |
| 250 | claude-3-haiku-20240307 | 1230.00 | ±7 | 14,983 | Anthropic | Proprietary |
| 251 | command-r-plus-08-2024 | 1230.00 | ±14 | 1,467 | Cohere | CC-BY-NC-4.0 |
| 252 | gemini-1.5-flash-8b-001 | 1228.00 | ±8 | 5,036 | Google | Proprietary |
| 253 | mixtral-8x22b-instruct-v0.1 | 1227.00 | ±9 | 6,778 | Mistral | Apache 2.0 |
| 254 | olmo-2-0325-32b-instruct | 1227.00 | ±28 | 375 | Ai2 | Apache-2.0 |
| 255 | amazon-nova-micro-v1.0 | 1224.00 | ±11 | 2,455 | Amazon | Proprietary |
| 256 | qwen1.5-110b-chat | 1221.00 | ±11 | 3,188 | Alibaba | Qianwen LICENSE |
| 257 | mistral-medium | 1220.00 | ±11 | 4,406 | Mistral | Proprietary |
| 258 | gemma-2-9b-it | 1217.00 | ±8 | 7,110 | Google | Gemma license |
| 259 | phi-3-medium-4k-instruct | 1215.00 | ±11 | 3,238 | Microsoft | MIT |
| 260 | qwq-32b-preview | 1213.00 | ±24 | 480 | Alibaba | Apache 2.0 |
| 261 | ministral-8b-2410 | 1213.00 | ±20 | 683 | Mistral | MRL |
| 262 | yi-1.5-34b-chat | 1213.00 | ±11 | 2,985 | 01 AI | Apache-2.0 |
| 263 | command-r-plus | 1212.00 | ±8 | 9,769 | Cohere | CC-BY-NC-4.0 |
| 264 | reka-flash-21b-20240226-online | 1211.00 | ±14 | 2,028 | Reka AI | Proprietary |
| 265 | qwen1.5-72b-chat | 1208.00 | ±10 | 5,327 | Alibaba | Qianwen LICENSE |
| 266 | llama-3.1-tulu-3-8b | 1206.00 | ±26 | 363 | Ai2 | Llama 3.1 |
| 267 | internlm2_5-20b-chat | 1206.00 | ±15 | 1,387 | InternLM | Other |
| 268 | command-r-08-2024 | 1205.00 | ±14 | 1,601 | Cohere | CC-BY-NC-4.0 |
| 269 | gemma-2-9b-it-simpo | 1204.00 | ±15 | 1,285 | Princeton | MIT |
| 270 | gpt-3.5-turbo-1106 | 1202.00 | ±15 | 2,134 | OpenAI | Proprietary |
| 271 | qwen1.5-32b-chat | 1200.00 | ±12 | 2,649 | Alibaba | Qianwen LICENSE |
| 272 | c4ai-aya-expanse-8b | 1199.00 | ±15 | 1,307 | Cohere | CC-BY-NC-4.0 |
| 273 | gpt-3.5-turbo-0125 | 1198.00 | ±8 | 8,626 | OpenAI | Proprietary |
| 274 | reka-flash-21b-20240226 | 1198.00 | ±11 | 3,363 | Reka AI | Proprietary |
| 275 | gemini-pro | 1197.00 | ±19 | 993 | Google | Proprietary |
| 276 | granite-3.1-2b-instruct | 1197.00 | ±26 | 391 | IBM | Apache 2.0 |
| 277 | granite-3.0-8b-instruct | 1196.00 | ±19 | 873 | IBM | Apache 2.0 |
| 278 | zephyr-orpo-141b-A35b-v0.1 | 1195.00 | ±22 | 589 | HuggingFace | Apache 2.0 |
| 279 | dbrx-instruct-preview | 1195.00 | ±11 | 4,001 | Databricks | DBRX LICENSE |
| 280 | gemini-pro-dev-api | 1194.00 | ±14 | 2,274 | Google | Proprietary |
| 281 | phi-3-mini-4k-instruct-june-2024 | 1193.00 | ±14 | 1,568 | Microsoft | MIT |
| 282 | phi-3-small-8k-instruct | 1193.00 | ±13 | 2,092 | Microsoft | MIT |
| 283 | llama-3-8b-instruct | 1191.00 | ±8 | 14,252 | Meta | Llama 3 Community |
| 284 | mixtral-8x7b-instruct-v0.1 | 1190.00 | ±9 | 9,663 | Mistral | Apache 2.0 |
| 285 | granite-3.1-8b-instruct | 1190.00 | ±28 | 382 | IBM | Apache 2.0 |
| 286 | llama-3.1-8b-instruct | 1189.00 | ±8 | 7,135 | Meta | Llama 3.1 Community |
| 287 | jamba-1.5-mini | 1185.00 | ±16 | 1,094 | AI21 Labs | Jamba Open |
| 288 | command-r | 1174.00 | ±9 | 6,682 | Cohere | CC-BY-NC-4.0 |
| 289 | granite-3.0-2b-instruct | 1167.00 | ±19 | 908 | IBM | Apache 2.0 |
| 290 | qwen1.5-14b-chat | 1166.00 | ±13 | 2,184 | Alibaba | Qianwen LICENSE |
| 291 | llama-3.2-3b-instruct | 1164.00 | ±16 | 1,136 | Meta | Llama 3.2 |
| 292 | snowflake-arctic-instruct | 1161.00 | ±11 | 4,793 | Snowflake | Apache 2.0 |
| 293 | gemma-2-2b-it | 1161.00 | ±8 | 6,599 | Google | Gemma license |
| 294 | gemma-1.1-7b-it | 1158.00 | ±11 | 3,039 | Google | Gemma license |
| 295 | starling-lm-7b-beta | 1158.00 | ±14 | 1,973 | Nexusflow | Apache-2.0 |
| 296 | wizardlm-70b | 1157.00 | ±19 | 903 | Microsoft | Llama 2 Community |
| 297 | openchat-3.5-0106 | 1157.00 | ±14 | 1,726 | OpenChat | Apache-2.0 |
| 298 | deepseek-llm-67b-chat | 1155.00 | ±23 | 576 | DeepSeek | DeepSeek License |
| 299 | smollm2-1.7b-instruct | 1151.00 | ±33 | 271 | HuggingFace | Apache 2.0 |
| 300 | openhermes-2.5-mistral-7b | 1150.00 | ±20 | 697 | NousResearch | Apache-2.0 |
| 301 | yi-34b-chat | 1150.00 | ±13 | 2,043 | 01 AI | Yi License |
| 302 | phi-3-mini-4k-instruct | 1150.00 | ±12 | 2,564 | Microsoft | MIT |
| 303 | tulu-2-dpo-70b | 1144.00 | ±19 | 888 | AllenAI/UW | AI2 ImpACT Low-risk |
| 304 | phi-3-mini-128k-instruct | 1138.00 | ±13 | 2,813 | Microsoft | MIT |
| 305 | llama-2-70b-chat | 1136.00 | ±10 | 4,740 | Meta | Llama 2 Community |
| 306 | mistral-7b-instruct-v0.2 | 1127.00 | ±12 | 2,605 | Mistral | Apache-2.0 |
| 307 | starling-lm-7b-alpha | 1126.00 | ±16 | 1,300 | UC Berkeley | CC-BY-NC-4.0 |
| 308 | qwen-14b-chat | 1124.00 | ±24 | 534 | Alibaba | Qianwen LICENSE |
| 309 | dolphin-2.2.1-mistral-7b | 1124.00 | ±32 | 219 | Cognitive Computations | Apache-2.0 |
| 310 | openchat-3.5 | 1124.00 | ±18 | 945 | OpenChat | Apache-2.0 |
| 311 | llama-3.2-1b-instruct | 1123.00 | ±16 | 1,162 | Meta | Llama 3.2 |
| 312 | qwen1.5-7b-chat | 1120.00 | ±20 | 690 | Alibaba | Qianwen LICENSE |
| 313 | gemma-7b-it | 1116.00 | ±16 | 1,120 | Google | Gemma license |
| 314 | vicuna-33b | 1115.00 | ±13 | 2,663 | LMSYS | Non-commercial |
| 315 | llama2-70b-steerlm-chat | 1114.00 | ±27 | 440 | Nvidia | Llama 2 Community |
| 316 | palm-2 | 1113.00 | ±19 | 901 | Google | Proprietary |
| 317 | llama-2-13b-chat | 1109.00 | ±13 | 2,218 | Meta | Llama 2 Community |
| 318 | solar-10.7b-instruct-v1.0 | 1108.00 | ±22 | 604 | Upstage AI | CC-BY-NC-4.0 |
| 319 | codellama-34b-instruct | 1108.00 | ±19 | 770 | Meta | Llama 2 Community |
| 320 | gemma-1.1-2b-it | 1105.00 | ±16 | 1,355 | Google | Gemma license |
| 321 | mpt-30b-chat | 1094.00 | ±34 | 242 | MosaicML | CC-BY-NC-SA-4.0 |
| 322 | nous-hermes-2-mixtral-8x7b-dpo | 1093.00 | ±21 | 628 | NousResearch | Apache-2.0 |
| 323 | llama-2-7b-chat | 1085.00 | ±14 | 1,656 | Meta | Llama 2 Community |
| 324 | qwen1.5-4b-chat | 1085.00 | ±18 | 988 | Alibaba | Qianwen LICENSE |
| 325 | stripedhyena-nous-7b | 1084.00 | ±20 | 676 | Together AI | Apache 2.0 |
| 326 | zephyr-7b-beta | 1082.00 | ±17 | 1,250 | HuggingFace | MIT |
| 327 | vicuna-13b | 1082.00 | ±14 | 2,146 | LMSYS | Llama 2 Community |
| 328 | mistral-7b-instruct | 1081.00 | ±19 | 974 | Mistral | Apache 2.0 |
| 329 | guanaco-33b | 1079.00 | ±32 | 280 | UW | Non-commercial |
| 330 | gemma-2b-it | 1068.00 | ±22 | 597 | Google | Gemma license |
| 331 | wizardlm-13b | 1064.00 | ±21 | 669 | Microsoft | Llama 2 Community |
| 332 | olmo-7b-instruct | 1054.00 | ±19 | 848 | Ai2 | Apache-2.0 |
| 333 | vicuna-7b | 1046.00 | ±22 | 658 | LMSYS | Llama 2 Community |
| 334 | chatglm3-6b | 1041.00 | ±23 | 576 | Tsinghua | Apache-2.0 |
| 335 | gpt4all-13b-snoozy | 997.00 | ±37 | 211 | Nomic AI | Non-commercial |
| 336 | alpaca-13b | 989.00 | ±23 | 652 | Stanford | Non-commercial |
| 337 | mpt-7b-chat | 983.00 | ±25 | 471 | MosaicML | CC-BY-NC-SA-4.0 |
| 338 | RWKV-4-Raven-14B | 982.00 | ±24 | 544 | RWKV | Apache 2.0 |
| 339 | koala-13b | 979.00 | ±21 | 751 | UC Berkeley | Non-commercial |
| 340 | chatglm-6b | 976.00 | ±25 | 525 | Tsinghua | Non-commercial |
| 341 | chatglm2-6b | 970.00 | ±35 | 227 | Tsinghua | Apache-2.0 |
| 342 | oasst-pythia-12b | 958.00 | ±22 | 687 | OpenAssistant | Apache 2.0 |
| 343 | dolly-v2-12b | 948.00 | ±29 | 370 | Databricks | MIT |
| 344 | fastchat-t5-3b | 918.00 | ±26 | 462 | LMSYS | Apache 2.0 |
| 345 | llama-13b | 917.00 | ±33 | 252 | Meta | Non-commercial |
| 346 | stablelm-tuned-alpha-7b | 889.00 | ±29 | 353 | Stability AI | CC-BY-NC-SA-4.0 |
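The 95% CI column matters when comparing neighbouring ranks. The sketch below checks whether two score intervals overlap, using rows from the table above. It treats each CI value as a symmetric half-width around the score, which is an assumption about how the page reports it, and the `Entry` type and `intervals_overlap` helper are illustrative names. Overlapping intervals are only a rough signal that two models may not be separable, not a formal significance test.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    model: str
    score: float
    ci: float  # assumed half-width of the reported 95% interval


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True when the two score intervals share at least one point."""
    return a.score - a.ci <= b.score + b.ci and b.score - b.ci <= a.score + a.ci


# Rows copied from the ranking table (ranks 1, 2, 13 and 37).
leader = Entry("gpt-5.4-high", 1515.0, 18.0)
others = [
    Entry("Claude Opus 4.6 (thinking)", 1512.0, 15.0),
    Entry("Gemini 3.0 Pro (Preview 11-2025)", 1478.0, 12.0),
    Entry("OpenAI o3", 1448.0, 10.0),
]

for entry in others:
    verdict = "overlaps with" if intervals_overlap(leader, entry) else "is clearly below"
    print(f"{entry.model} {verdict} {leader.model}")
```

Entries with few votes have wide intervals, so a newly added model can sit high in the table while still overlapping with many models below it.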