LMArena Math Arena Leaderboard

The latest AI math reasoning leaderboard, based on anonymous user voting in LMArena Math Arena. It covers Elo scores, confidence intervals, and vote counts for Claude, GPT, Gemini, DeepSeek, Qwen, and other models.

Top Model: Gemini 3.5 Flash
Top Score: 1518.00
Model Count: 355
Data version: June 10, 2026
Data source: LMArena

About This Leaderboard

This leaderboard ranks AI models by mathematical reasoning ability. Data comes from LMArena's Math sub-track, evaluated through anonymous blind testing by real users on math problem-solving tasks.

Methodology Overview

Blind testing: Users submit math problems, two anonymous models provide solutions, and users vote for the better answer. Because model identities are hidden during voting, brand bias is reduced.

Elo scoring: Elo-style scores are computed with the Bradley-Terry model. A higher score means users more often prefer that model's math solutions.

Broad scenario coverage: Testing spans algebra, geometry, calculus, competition math, and other real-world math tasks.
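
The scoring idea in the methodology can be illustrated with a minimal sketch. It assumes the textbook Bradley-Terry model fitted with the MM (Zermelo) iteration and mapped to an Elo-like scale with the conventional factor of 400. The function names, the `center` anchor, and the toy vote counts are hypothetical, and LMArena's actual pipeline (tie handling, anchoring, interval estimation) is not described here and may differ.

```python
import math


def win_probability(rating_a: float, rating_b: float) -> float:
    """Modeled probability that A's answer is preferred over B's (Elo logistic curve)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def fit_bradley_terry(wins: list[list[int]], n_iter: int = 200) -> list[float]:
    """Fit Bradley-Terry strengths with the classic MM (Zermelo) iteration.

    wins[i][j] is how many times model i was preferred over model j.
    Assumes every model has at least one win and one loss; ties are ignored.
    """
    n = len(wins)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        p = [value / norm for value in updated]  # keep strengths summing to 1
    return p


def to_elo_scale(strengths: list[float], center: float = 1000.0) -> list[float]:
    """Map strengths to an Elo-like scale; `center` is an arbitrary anchor (assumption)."""
    logs = [400.0 * math.log10(s) for s in strengths]
    mean = sum(logs) / len(logs)
    return [center + value - mean for value in logs]


# Toy data (hypothetical): model 0 was preferred 75 times, model 1 was preferred 25 times.
toy_wins = [[0, 75], [25, 0]]
toy_ratings = to_elo_scale(fit_bradley_terry(toy_wins))
toy_probability = win_probability(toy_ratings[0], toy_ratings[1])
```

In the toy case the fitted ratings reproduce the observed 75% preference rate. Under the same curve, a gap of only a few points between two models corresponds to a modeled preference rate only slightly above 50%.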

DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.

Ranking Table

Rank | Model | Score | 95% CI | Votes | Organization | License
1 | Gemini 3.5 Flash | 1518.00 | +/-25 | 582 | Google Deep Mind | Proprietary
2 | Claude Opus 4.6 (thinking) | 1516.00 | +/-13 | 2,324 | Anthropic | Proprietary
3 | GPT-5.4 (high) | 1506.00 | +/-13 | 2,107 | OpenAI | Proprietary
4 | Claude Opus 4.6 | 1503.00 | +/-12 | 2,675 | Anthropic | Proprietary
5 | claude-opus-4-8-thinking | 1498.00 | +/-26 | 458 | Anthropic | Proprietary
6 | Gemini 3.1 Pro Preview | 1498.00 | +/-11 | 3,167 | Google Deep Mind | Proprietary
7 | GPT-5.5 (high) | 1495.00 | +/-16 | 1,386 | OpenAI | Proprietary
8 | Opus 4.7 | 1495.00 | +/-15 | 1,644 | Anthropic | Proprietary
9 | Opus 4.7 (thinking) | 1494.00 | +/-15 | 1,594 | Anthropic | Proprietary
10 | Qwen3.7-Max-Preview | 1492.00 | +/-40 | 219 | Alibaba | Proprietary
11 | GPT-5.5 | 1489.00 | +/-16 | 1,409 | OpenAI | Proprietary
12 | claude-opus-4-8 | 1488.00 | +/-26 | 482 | Anthropic | Proprietary
13 | Kimi K2.6 | 1483.00 | +/-17 | 1,231 | Moonshot AI | Modified MIT
14 | mimo-v2.5-pro | 1482.00 | +/-17 | 1,218 | Xiaomi | MIT
15 | ERNIE-5.1-Preview | 1482.00 | +/-17 | 1,196 | Baidu | Proprietary
16 | Qwen3.6-Max-Preview | 1479.00 | +/-30 | 355 | Alibaba | Proprietary
17 | Gemini 3.0 Pro (Preview 11-2025) | 1478.00 | +/-11 | 2,653 | Google Deep Mind | Proprietary
18 | minimax-m3 | 1477.00 | +/-31 | 383 | MiniMax | Proprietary
19 | Gemini 3.0 Flash | 1477.00 | +/-13 | 2,002 | Google Deep Mind | Proprietary
20 | Kimi K2 Thinking | 1473.00 | +/-12 | 2,653 | Moonshot AI | Modified MIT
21 | GLM 5.1 | 1473.00 | +/-19 | 957 | Zhipu AI | MIT
22 | grok-4.20-beta-0309-reasoning | 1471.00 | +/-13 | 2,193 | xAI | Proprietary
23 | Qwen3.5 Max Preview | 1470.00 | +/-16 | 1,344 | Alibaba | Proprietary
24 | Claude Opus 4 (thinking-32k) | 1470.00 | +/-12 | 2,267 | Anthropic | Proprietary
25 | DeepSeek-V4-Pro (thinking) | 1469.00 | +/-17 | 1,210 | DeepSeek-AI | MIT
26 | Gemma 4 31B | 1469.00 | +/-28 | 399 | DeepMind | Apache 2.0
27 | Gemma 4 26B A4B | 1467.00 | +/-28 | 372 | DeepMind | Apache 2.0
28 | Claude Opus 4 | 1466.00 | +/-9 | 4,333 | Anthropic | Proprietary
29 | GPT-5.5 Instant | 1463.00 | +/-16 | 1,470 | OpenAI | Proprietary
30 | Muse Spark | 1462.00 | +/-20 | 853 | Facebook AI Research | Proprietary
31 | Claude Sonnet 4.6 | 1460.00 | +/-13 | 2,121 | Anthropic | Proprietary
32 | GPT-5.2 Pro (high) | 1458.00 | +/-11 | 2,976 | OpenAI | Proprietary
33 | GPT-5.4 | 1458.00 | +/-13 | 2,226 | OpenAI | Proprietary
34 | Gemini 3.0 Flash (minimal) | 1457.00 | +/-10 | 3,631 | Google Deep Mind | Proprietary
35 | Qwen 3.6 Plus Preview | 1456.00 | +/-15 | 1,536 | Alibaba | Proprietary
36 | Claude Sonnet 4.5 (thinking-32k) | 1456.00 | +/-9 | 4,909 | Anthropic | Proprietary
37 | GPT-5.1 Pro (high) | 1455.00 | +/-12 | 2,500 | OpenAI | Proprietary
38 | mimo-v2-pro | 1454.00 | +/-15 | 1,625 | Xiaomi | Proprietary
39 | GPT-5.2 | 1453.00 | +/-13 | 2,077 | OpenAI | Proprietary
40 | Grok 4.20 Beta | 1452.00 | +/-15 | 1,596 | xAI | Proprietary
41 | grok-4.20-multi-agent-beta-0309 | 1452.00 | +/-13 | 2,159 | xAI | Proprietary
42 | DOLA Seed 2.0 Pro | 1450.00 | +/-12 | 2,713 | ByteDance Seed | Proprietary
43 | Qwen3.5-397B-A17B | 1449.00 | +/-12 | 2,445 | Alibaba | Apache 2.0
44 | OpenAI o3 | 1447.00 | +/-10 | 3,730 | OpenAI | Proprietary
45 | mimo-v2.5 | 1444.00 | +/-16 | 1,297 | Xiaomi | MIT
46 | Opus 4.1 (thinking-16k) | 1443.00 | +/-11 | 3,026 | Anthropic | Proprietary
47 | Grok 4.1 Thinking | 1443.00 | +/-10 | 3,826 | xAI | Proprietary
48 | Kimi K2.5 Instant | 1442.00 | +/-25 | 513 | Moonshot AI | Modified MIT
49 | Gemini 2.5 Pro Experimental 03-25 | 1442.00 | +/-7 | 7,638 | Google Deep Mind | Proprietary
50 | GPT-5.4 mini (high) | 1441.00 | +/-14 | 2,048 | OpenAI | Proprietary
51 | GLM-5 | 1440.00 | +/-15 | 1,401 | Zhipu AI | MIT
52 | mimo-v2-omni | 1440.00 | +/-29 | 445 | Xiaomi | Proprietary
53 | nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | 1440.00 | +/-40 | 207 | Nvidia | OpenMDW-1.1
54 | Kimi K2 Thinking (thinking-turbo) | 1439.00 | +/-10 | 3,779 | Moonshot AI | Modified MIT
55 | DeepSeek-V4-Pro | 1439.00 | +/-16 | 1,471 | DeepSeek-AI | MIT
56 | Qwen3 Max (Preview) | 1439.00 | +/-15 | 1,525 | Alibaba | Proprietary
57 | gemini-3.1-flash-lite-preview | 1439.00 | +/-12 | 2,636 | Google | Proprietary
58 | mistral-medium-3.5 | 1438.00 | +/-27 | 459 | Mistral | Modified MIT
59 | ERNIE 5.0 | 1437.00 | +/-13 | 2,144 | Baidu | Proprietary
60 | longcat-flash-chat-2602-exp | 1437.00 | +/-14 | 1,749 | Meituan | Proprietary
61 | DeepSeek-V4-Flash (thinking) | 1435.00 | +/-17 | 1,311 | DeepSeek-AI | MIT
62 | GPT-5-Pro (high) | 1434.00 | +/-14 | 1,886 | OpenAI | Proprietary
63 | Opus 4.1 | 1433.00 | +/-9 | 4,724 | Anthropic | Proprietary
64 | GPT-5.4 nano (high) | 1433.00 | +/-14 | 1,887 | OpenAI | Proprietary
65 | GPT-5.2 | 1433.00 | +/-11 | 3,272 | OpenAI | Proprietary
66 | DeepSeek-V4-Flash | 1432.00 | +/-16 | 1,352 | DeepSeek-AI | MIT
67 | DeepSeek V3.2 | 1430.00 | +/-11 | 3,001 | DeepSeek-AI | MIT
68 | Qwen3.5-27B | 1430.00 | +/-15 | 1,644 | Alibaba | Apache 2.0
69 | Grok 4.1 | 1429.00 | +/-9 | 4,228 | xAI | Proprietary
70 | hunyuan-hy3-preview | 1429.00 | +/-28 | 406 | Tencent | tencent-hunyuan-community
71 | qwen3-max-2025-09-23 | 1429.00 | +/-24 | 584 | Alibaba | Proprietary
72 | GLM-4.7 | 1428.00 | +/-21 | 710 | Zhipu AI | MIT
73 | amazon-nova-experimental-chat-26-02-10 | 1428.00 | +/-39 | 207 | Amazon | Proprietary
74 | Grok 4 | 1428.00 | +/-12 | 2,266 | xAI | Proprietary
75 | DeepSeek V3.2-Exp (thinking) | 1428.00 | +/-26 | 481 | DeepSeek-AI | MIT
76 | Claude Sonnet 4.5 | 1428.00 | +/-9 | 4,904 | Anthropic | Proprietary
77 | DeepSeek V3.2 (thinking) | 1426.00 | +/-12 | 2,501 | DeepSeek-AI | MIT
78 | GPT-5.3 | 1425.00 | +/-14 | 2,040 | OpenAI | Proprietary
79 | Qwen3.5-122B-A10B | 1424.00 | +/-14 | 1,769 | Alibaba | Apache 2.0
80 | Grok 4 Fast | 1424.00 | +/-29 | 399 | xAI | Proprietary
81 | GPT-5.1 Instant | 1424.00 | +/-11 | 2,866 | OpenAI | Proprietary
82 | Grok 4.3 Beta | 1422.00 | +/-17 | 1,258 | xAI | Proprietary
83 | GLM-4.6 | 1421.00 | +/-13 | 2,107 | Zhipu AI | MIT
84 | Claude Opus 4 (thinking-16k) | 1420.00 | +/-12 | 2,240 | Anthropic | Proprietary
85 | Qwen3-235B-A22B-2507 | 1420.00 | +/-8 | 5,921 | Alibaba | Apache 2.0
86 | Qwen3-Next | 1419.00 | +/-17 | 1,212 | Alibaba | Apache 2.0
87 | DeepSeek V3.2-Exp | 1418.00 | +/-21 | 775 | DeepSeek-AI | MIT
88 | Grok 4.1 Fast (fast-reasoning) | 1418.00 | +/-10 | 3,487 | xAI | Proprietary
89 | longcat-flash-chat | 1417.00 | +/-22 | 689 | Meituan | MIT
90 | Kimi K2 0905 | 1416.00 | +/-21 | 759 | Moonshot AI | Modified MIT
91 | OpenAI o4 - mini | 1416.00 | +/-11 | 2,937 | OpenAI | Proprietary
92 | MiniMax-M2.7 | 1415.00 | +/-14 | 1,784 | MiniMaxAI | Modified MIT
93 | DeepSeek-V3.1 | 1415.00 | +/-18 | 993 | DeepSeek-AI | MIT
94 | DeepSeek-V3.1 (thinking) | 1415.00 | +/-22 | 663 | DeepSeek-AI | MIT
95 | GPT-5 | 1413.00 | +/-14 | 1,786 | OpenAI | Proprietary
96 | GLM-4.5 | 1413.00 | +/-15 | 1,425 | Zhipu AI | MIT
97 | Gemini 2.5 Flash-Preview-09-2025 | 1413.00 | +/-13 | 1,944 | Google Deep Mind | Proprietary
98 | Grok 4 Fast (fast-reasoning) | 1412.00 | +/-18 | 1,084 | xAI | Proprietary
99 | Qwen3-VL-235B-A22B-Instruct | 1411.00 | +/-23 | 703 | Alibaba | Apache 2.0
100 | DeepSeek-R1 | 1411.00 | +/-14 | 1,606 | DeepSeek-AI | MIT
101 | amazon-nova-experimental-chat-26-01-10 | 1409.00 | +/-33 | 263 | Amazon | Proprietary
102 | GPT-4.5 | 1409.00 | +/-15 | 1,393 | OpenAI | Proprietary
103 | OpenAI o1 | 1409.00 | +/-11 | 2,986 | OpenAI | Proprietary
104 | DeepSeek-V3.1 Terminus (thinking) | 1409.00 | +/-40 | 200 | DeepSeek-AI | MIT
105 | ERNIE 5.0 | 1408.00 | +/-23 | 620 | Baidu | Proprietary
106 | Step 3.5 Flash | 1408.00 | +/-12 | 2,483 | StepFunAI | Apache 2.0
107 | Gemini 2.5 Flash | 1406.00 | +/-7 | 7,876 | Google Deep Mind | Proprietary
108 | OpenAI o3-mini (high) | 1406.00 | +/-13 | 1,909 | OpenAI | Proprietary
109 | GPT-5-mini (high) | 1405.00 | +/-15 | 1,460 | OpenAI | Proprietary
110 | Qwen3-VL-235B-A22B-Instruct (thinking) | 1405.00 | +/-28 | 428 | Alibaba | Apache 2.0
111 | Step 3.5 Flash | 1405.00 | +/-13 | 2,227 | StepFunAI | Proprietary
112 | GPT-4o(2025-03-27) | 1404.00 | +/-8 | 5,723 | OpenAI | Proprietary
113 | Claude Opus 4 | 1403.00 | +/-11 | 2,769 | Anthropic | Proprietary
114 | Claude Sonnet 4 (thinking-32k) | 1403.00 | +/-13 | 2,023 | Anthropic | Proprietary
115 | Mistral Large 3 | 1402.00 | +/-11 | 2,787 | MistralAI | Apache 2.0
116 | Hunyuan-T1 | 1401.00 | +/-38 | 236 | Tencent AI Lab | Proprietary
117 | Qwen3.5-35B-A3B | 1400.00 | +/-14 | 1,752 | Alibaba | Apache 2.0
118 | amazon-nova-experimental-chat-12-10 | 1400.00 | +/-37 | 234 | Amazon | Proprietary
119 | ERNIE 5.0 | 1400.00 | +/-34 | 268 | Baidu | Proprietary
120 | Magistral-Medium-2506 | 1399.00 | +/-8 | 5,817 | MistralAI | Proprietary
121 | Qwen3-32B | 1399.00 | +/-30 | 316 | Alibaba | Apache 2.0
122 | amazon-nova-experimental-chat-11-10 | 1398.00 | +/-15 | 1,584 | Amazon | Proprietary
123 | qwen3-235b-a22b-thinking-2507 | 1398.00 | +/-24 | 489 | Alibaba | Apache 2.0
124 | Haiku 4.5 | 1397.00 | +/-9 | 5,213 | Anthropic | Proprietary
125 | DeepSeek-R1-0528 | 1396.00 | +/-20 | 869 | DeepSeek-AI | MIT
126 | MiniMax M2.5 | 1396.00 | +/-12 | 2,429 | MiniMaxAI | Modified MIT
127 | DeepSeek-V3.1 Terminus | 1395.00 | +/-39 | 219 | DeepSeek-AI | MIT
128 | amazon-nova-experimental-chat-10-20 | 1395.00 | +/-20 | 806 | Amazon | Proprietary
129 | qwen3-235b-a22b-no-thinking | 1395.00 | +/-12 | 2,390 | Alibaba | Apache 2.0
130 | Qwen3-235B-A22B | 1393.00 | +/-14 | 1,604 | Alibaba | Apache 2.0
131 | M2.1 | 1393.00 | +/-18 | 1,010 | MiniMaxAI | MIT
132 | GLM-4.5-Air | 1390.00 | +/-15 | 1,540 | Zhipu AI | MIT
133 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1390.00 | +/-39 | 194 | Nvidia | Nvidia Open
134 | Qwen3-Next (thinking) | 1390.00 | +/-20 | 829 | Alibaba | Apache 2.0
135 | Claude Sonnet 4 | 1388.00 | +/-12 | 2,473 | Anthropic | Proprietary
136 | OpenAI o3-mini (high) | 1388.00 | +/-18 | 977 | OpenAI | Proprietary
137 | Kimi K2 | 1388.00 | +/-14 | 1,693 | Moonshot AI | Modified MIT
138 | OpenAI o1 | 1386.00 | +/-10 | 4,569 | OpenAI | Proprietary
139 | trinity-large-thinking | 1385.00 | +/-15 | 1,612 | Arcee AI | Apache 2.0
140 | Claude Sonnet 3.7 (thinking-32k) | 1384.00 | +/-11 | 2,793 | Anthropic | Proprietary
141 | intellect-3 | 1383.00 | +/-31 | 334 | Prime Intellect | MIT
142 | GPT OSS 120B | 1382.00 | +/-14 | 1,793 | OpenAI | Apache 2.0
143 | OpenAI o3-mini | 1382.00 | +/-8 | 4,720 | OpenAI | Proprietary
144 | Qwen3-30B-A3B-2507 | 1381.00 | +/-15 | 1,426 | Alibaba | Apache 2.0
145 | llama-3.1-nemotron-ultra-253b-v1 | 1380.00 | +/-37 | 209 | Nvidia | Nvidia Open Model
146 | mimo-v2-flash (non-thinking) | 1379.00 | +/-11 | 2,839 | Xiaomi | MIT
147 | Qwen3-Coder-480B-A35B | 1377.00 | +/-15 | 1,627 | Alibaba | Apache 2.0
148 | nvidia-nemotron-3-super-120b-a12b | 1375.00 | +/-25 | 515 | Nvidia | NVIDIA Open Model
149 | Grok 3 | 1375.00 | +/-11 | 2,677 | xAI | Proprietary
150 | mimo-v2-flash (thinking) | 1374.00 | +/-22 | 633 | Xiaomi | MIT
151 | GPT-4.1 | 1374.00 | +/-10 | 3,227 | OpenAI | Proprietary
152 | minimax-m1 | 1371.00 | +/-13 | 1,796 | MiniMax | Apache 2.0
153 | DeepSeek-V3-0324 | 1370.00 | +/-10 | 3,191 | DeepSeek-AI | MIT
154 | grok-3-mini-beta | 1369.00 | +/-14 | 1,529 | xAI | Proprietary
155 | GLM-4.7-Flash | 1366.00 | +/-21 | 717 | Zhipu AI | MIT
156 | Gemini 2.5 Flash-Lite-Preview-09-2025 (no-thinking) | 1365.00 | +/-11 | 2,878 | Google Deep Mind | Proprietary
157 | Gemini 2.5 Flash-Lite (thinking) | 1365.00 | +/-12 | 2,094 | Google Deep Mind | Proprietary
158 | Qwen2.5-Max | 1364.00 | +/-10 | 3,305 | Alibaba | Proprietary
159 | QwQ-32B | 1364.00 | +/-14 | 1,719 | Alibaba | Apache 2.0
160 | Step3 | 1364.00 | +/-31 | 351 | StepFunAI | Apache 2.0
161 | Claude Sonnet 3.7 | 1362.00 | +/-10 | 3,358 | Anthropic | Proprietary
162 | OpenAI o1-mini | 1362.00 | +/-8 | 7,499 | OpenAI | Proprietary
163 | trinity-large-preview | 1361.00 | +/-14 | 1,885 | Arcee AI | Apache 2.0
164 | GLM-4.5V | 1357.00 | +/-34 | 277 | Zhipu AI | MIT
165 | Gemini 2.0 Flash Experimental | 1356.00 | +/-9 | 4,066 | DeepMind | Proprietary
166 | MiniMax M2 | 1356.00 | +/-33 | 319 | MiniMaxAI | Apache 2.0
167 | ling-flash-2.0 | 1355.00 | +/-27 | 461 | Ant Group | MIT
168 | GPT-4.1 mini | 1355.00 | +/-11 | 2,693 | OpenAI | Proprietary
169 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1354.00 | +/-19 | 987 | Nvidia | NVIDIA Open Model
170 | Qwen3-30B-A3B | 1353.00 | +/-14 | 1,707 | Alibaba | Apache 2.0
171 | Claude 3.5 Sonnet | 1350.00 | +/-7 | 10,017 | Anthropic | Proprietary
172 | mistral-medium-2505 | 1349.00 | +/-12 | 2,228 | Mistral | Proprietary
173 | hunyuan-turbos-20250416 | 1348.00 | +/-20 | 845 | Tencent | Proprietary
174 | GPT-5-Nano (high) | 1345.00 | +/-27 | 494 | OpenAI | Proprietary
175 | Claude 3.5 Sonnet | 1341.00 | +/-7 | 11,359 | Anthropic | Proprietary
176 | ring-flash-2.0 | 1340.00 | +/-27 | 454 | Ant Group | MIT
177 | Mistral-Small-3.2 | 1339.00 | +/-18 | 1,042 | MistralAI | Apache 2.0
178 | Gemini 1.5 Pro | 1339.00 | +/-7 | 7,610 | Google Deep Mind | Proprietary
179 | GPT OSS 20B | 1336.00 | +/-22 | 680 | OpenAI | Apache 2.0
180 | Nova 2 Lite | 1335.00 | +/-20 | 825 | Amazon | Proprietary
181 | Gemini 2.0 Flash-Lite | 1326.00 | +/-10 | 2,814 | DeepMind | Proprietary
182 | qwen-plus-0125 | 1324.00 | +/-19 | 732 | Alibaba | Proprietary
183 | Gemma 3 - 27B (IT) | 1322.00 | +/-9 | 3,581 | Google Deep Mind | Gemma
184 | granite-4.1-8b | 1320.00 | +/-39 | 236 | IBM | Apache 2.0
185 | llama-3.1-405b-instruct-fp8 | 1319.00 | +/-8 | 8,482 | Meta | Llama 3.1 Community
186 | Llama 4 Maverick Instruct | 1319.00 | +/-11 | 2,838 | Facebook AI Research | Llama 4
187 | Gemma 3 - 12B (IT) | 1317.00 | +/-27 | 389 | Google Deep Mind | Gemma
188 | llama-3.1-405b-instruct-bf16 | 1315.00 | +/-8 | 5,215 | Meta | Llama 3.1 Community
189 | step-2-16k-exp-202412 | 1313.00 | +/-20 | 642 | StepFun | Proprietary
190 | athene-v2-chat | 1312.00 | +/-9 | 3,412 | NexusFlow | NexusFlow
191 | Claude3-Opus | 1312.00 | +/-6 | 25,769 | Anthropic | Proprietary
192 | olmo-3-32b-think | 1311.00 | +/-32 | 314 | Ai2 | Apache 2.0
193 | DeepSeek-V3 | 1311.00 | +/-11 | 2,721 | DeepSeek-AI | DeepSeek
194 | C4AI Command A (202503) | 1309.00 | +/-9 | 3,994 | CohereAI | CC-BY-NC-4.0
195 | Llama 4 Scout Instruct | 1309.00 | +/-13 | 1,944 | Facebook AI Research | Llama
196 | GPT-4o | 1308.00 | +/-8 | 6,826 | OpenAI | Proprietary
197 | olmo-3.1-32b-instruct | 1306.00 | +/-23 | 696 | Ai2 | Apache 2.0
198 | yi-lightning | 1306.00 | +/-10 | 3,921 | 01 AI | Proprietary
199 | gemini-advanced-0514 | 1305.00 | +/-10 | 6,395 | Google | Proprietary
200 | GPT-4o | 1305.00 | +/-7 | 15,103 | OpenAI | Proprietary
201 | qwen2.5-plus-1127 | 1305.00 | +/-14 | 1,404 | Alibaba | Proprietary
202 | GPT-4 | 1303.00 | +/-8 | 13,306 | OpenAI | Proprietary
203 | hunyuan-turbos-20250226 | 1302.00 | +/-31 | 238 | Tencent | Proprietary
204 | step-1o-turbo-202506 | 1299.00 | +/-24 | 564 | StepFun | Proprietary
205 | GPT-4 | 1299.00 | +/-8 | 12,374 | OpenAI | Proprietary
206 | glm-4-plus-0111 | 1298.00 | +/-19 | 721 | Zhipu | Proprietary
207 | Gemini 1.5 Pro | 1298.00 | +/-8 | 10,492 | Google Deep Mind | Proprietary
208 | olmo-3.1-32b-think | 1297.00 | +/-26 | 473 | Ai2 | Apache 2.0
209 | Qwen2.5-VL-72B-Instruct | 1297.00 | +/-8 | 5,415 | Alibaba | Qwen
210 | gpt-4-turbo-2024-04-09 | 1296.00 | +/-8 | 13,217 | OpenAI | Proprietary
211 | Llama3.3-70B-Instruct | 1296.00 | +/-8 | 5,777 | Facebook AI Research | Llama-3.3
212 | Grok 2 | 1294.00 | +/-7 | 8,950 | xAI | Proprietary
213 | hunyuan-large-2025-02-10 | 1294.00 | +/-24 | 497 | Tencent | Proprietary
214 | deepseek-v2.5-1210 | 1293.00 | +/-17 | 1,031 | DeepSeek | DeepSeek
215 | qwen-max-0919 | 1292.00 | +/-12 | 2,249 | Alibaba | Qwen
216 | hunyuan-standard-2025-02-10 | 1290.00 | +/-24 | 499 | Tencent | Proprietary
217 | gemini-1.5-flash-002 | 1288.00 | +/-9 | 4,789 | Google | Proprietary
218 | mistral-large-2407 | 1288.00 | +/-8 | 6,664 | Mistral | Mistral Research
219 | DeepSeek V2.5 | 1288.00 | +/-10 | 3,649 | DeepSeek-AI | DeepSeek
220 | glm-4-plus | 1287.00 | +/-10 | 3,599 | Zhipu AI | Proprietary
221 | Claude 3.5 Haiku | 1286.00 | +/-7 | 6,364 | Anthropic | Proprietary
222 | Magistral-Medium-2506 | 1285.00 | +/-26 | 553 | MistralAI | Proprietary
223 | GPT-4 | 1283.00 | +/-10 | 7,052 | OpenAI | Proprietary
224 | mistral-large-2411 | 1282.00 | +/-9 | 3,574 | Mistral | MRL
225 | hunyuan-large-vision | 1280.00 | +/-30 | 351 | Tencent | Proprietary
226 | hunyuan-turbo-0110 | 1279.00 | +/-31 | 243 | Tencent | Proprietary
227 | ibm-granite-h-small | 1279.00 | +/-32 | 358 | IBM | Apache 2.0
228 | Llama3.1-70B-Instruct | 1279.00 | +/-17 | 1,041 | Facebook AI Research | Llama 3.1
229 | Mistral-Small-3.1-24B-Instruct-2503 | 1278.00 | +/-13 | 2,131 | MistralAI | Apache 2.0
230 | GPT-4o mini | 1276.00 | +/-7 | 9,322 | OpenAI | Proprietary
231 | GPT-4 | 1275.00 | +/-8 | 11,181 | OpenAI | Proprietary
232 | GPT-4.1 nano | 1274.00 | +/-23 | 582 | OpenAI | Proprietary
233 | Qwen2-72B-Instruct | 1273.00 | +/-9 | 4,835 | Alibaba | Qianwen LICENSE
234 | grok-2-mini-2024-08-13 | 1273.00 | +/-8 | 7,261 | xAI | Proprietary
235 | deepseek-coder-v2 | 1271.00 | +/-13 | 1,858 | DeepSeek | DeepSeek License
236 | llama-3.1-nemotron-51b-instruct | 1271.00 | +/-22 | 507 | Nvidia | Llama 3.1
237 | Qwen2.5-Coder-32B-Instruct | 1270.00 | +/-19 | 725 | Alibaba | Apache 2.0
238 | amazon-nova-pro-v1.0 | 1269.00 | +/-10 | 2,978 | Amazon | Proprietary
239 | Llama3.1-70B-Instruct | 1269.00 | +/-8 | 7,677 | Facebook AI Research | Llama 3.1 Community
240 | Phi 4 - 14B | 1265.00 | +/-10 | 2,764 | Microsoft Azure | MIT
241 | llama-3.1-tulu-3-70b | 1264.00 | +/-25 | 397 | Ai2 | Llama 3.1
242 | Mistral Small 24B Instruct 2501 | 1261.00 | +/-13 | 1,683 | MistralAI | Apache 2.0
243 | athene-70b-0725 | 1261.00 | +/-10 | 2,921 | NexusFlow | CC-BY-NC-4.0
244 | Gemma-3n-E4B | 1260.00 | +/-15 | 1,573 | Google Deep Mind | Gemma
245 | Llama3-70B-Instruct | 1257.00 | +/-7 | 20,941 | Facebook AI Research | Llama 3 Community
246 | gemini-1.5-flash-001 | 1257.00 | +/-8 | 8,392 | Google | Proprietary
247 | Gemma 3 - 4B (IT) | 1254.00 | +/-28 | 423 | Google Deep Mind | Gemma
248 | Claude3-Sonnet | 1253.00 | +/-8 | 13,766 | Anthropic | Proprietary
249 | nemotron-4-340b-instruct | 1252.00 | +/-12 | 2,352 | Nvidia | NVIDIA Open Model
250 | hunyuan-standard-256k | 1250.00 | +/-29 | 361 | Tencent | Proprietary
251 | GLM4 | 1247.00 | +/-16 | 1,191 | Zhipu AI | Proprietary
252 | reka-core-20240904 | 1246.00 | +/-14 | 1,207 | Reka AI | Proprietary
253 | gemma-2-27b-it | 1245.00 | +/-7 | 10,170 | Google | Gemma license
254 | jamba-1.5-large | 1245.00 | +/-15 | 1,147 | AI21 Labs | Jamba Open
255 | amazon-nova-lite-v1.0 | 1244.00 | +/-11 | 2,511 | Amazon | Proprietary
256 | mistral-large-2402 | 1244.00 | +/-9 | 7,987 | Mistral | Proprietary
257 | C4AI Aya Vision 32B | 1232.00 | +/-10 | 3,854 | CohereAI | CC-BY-NC-4.0
258 | reka-flash-20240904 | 1232.00 | +/-14 | 1,284 | Reka AI | Proprietary
259 | Claude3-Haiku | 1231.00 | +/-7 | 14,983 | Anthropic | Proprietary
260 | command-r-plus-08-2024 | 1231.00 | +/-14 | 1,467 | Cohere | CC-BY-NC-4.0
261 | gemini-1.5-flash-8b-001 | 1229.00 | +/-8 | 5,036 | Google | Proprietary
262 | Mixtral-8x22B-Instruct-v0.1 | 1228.00 | +/-9 | 6,778 | MistralAI | Apache 2.0
263 | olmo-2-0325-32b-instruct | 1227.00 | +/-28 | 375 | Ai2 | Apache-2.0
264 | amazon-nova-micro-v1.0 | 1224.00 | +/-11 | 2,455 | Amazon | Proprietary
265 | Qwen1.5-110B-Chat | 1221.00 | +/-11 | 3,188 | Alibaba | Qianwen LICENSE
266 | mistral-medium | 1220.00 | +/-11 | 4,406 | Mistral | Proprietary
267 | gemma-2-9b-it | 1218.00 | +/-8 | 7,110 | Google | Gemma license
268 | Phi-3-medium 14B-preview | 1215.00 | +/-11 | 3,238 | Microsoft Azure | MIT
269 | ministral-8b-2410 | 1214.00 | +/-20 | 683 | Mistral | MRL
270 | C4AI Command R+ | 1213.00 | +/-8 | 9,769 | CohereAI | CC-BY-NC-4.0
271 | Yi-1.5-34B | 1213.00 | +/-11 | 2,985 | 01.AI | Apache-2.0
272 | QwQ-32B-Preview | 1212.00 | +/-24 | 480 | Alibaba | Apache 2.0
273 | reka-flash-21b-20240226-online | 1211.00 | +/-14 | 2,028 | Reka AI | Proprietary
274 | Qwen1.5-72B-Chat | 1208.00 | +/-10 | 5,327 | Alibaba | Qianwen LICENSE
275 | InternLM2-Base-20B | 1207.00 | +/-15 | 1,387 | Shanghai AI Laboratory | Other
276 | llama-3.1-tulu-3-8b | 1207.00 | +/-26 | 363 | Ai2 | Llama 3.1
277 | command-r-08-2024 | 1206.00 | +/-14 | 1,601 | Cohere | CC-BY-NC-4.0
278 | gemma-2-9b-it-simpo | 1205.00 | +/-15 | 1,285 | Princeton | MIT
279 | gpt-3.5-turbo-1106 | 1203.00 | +/-15 | 2,134 | OpenAI | Proprietary
280 | qwen1.5-32b-chat | 1200.00 | +/-12 | 2,649 | Alibaba | Qianwen LICENSE
281 | C4AI Aya Vision 8B | 1200.00 | +/-15 | 1,307 | CohereAI | CC-BY-NC-4.0
282 | gpt-3.5-turbo-0125 | 1200.00 | +/-8 | 8,626 | OpenAI | Proprietary
283 | Gemini-pro | 1199.00 | +/-19 | 993 | DeepMind | Proprietary
284 | reka-flash-21b-20240226 | 1199.00 | +/-11 | 3,363 | Reka AI | Proprietary
285 | granite-3.1-2b-instruct | 1197.00 | +/-26 | 391 | IBM | Apache 2.0
286 | granite-3.0-8b-instruct | 1197.00 | +/-19 | 873 | IBM | Apache 2.0
287 | zephyr-orpo-141b-A35b-v0.1 | 1196.00 | +/-22 | 589 | HuggingFace | Apache 2.0
288 | DBRX Instruct | 1195.00 | +/-11 | 4,001 | databricks | DBRX LICENSE
289 | gemini-pro-dev-api | 1195.00 | +/-14 | 2,274 | Google | Proprietary
290 | Phi-3-mini 3.8B | 1193.00 | +/-14 | 1,568 | Microsoft Azure | MIT
291 | Phi-3-small 7B | 1193.00 | +/-13 | 2,092 | Microsoft Azure | MIT
292 | Llama3-8B-Instruct | 1192.00 | +/-8 | 14,252 | Facebook AI Research | Llama 3 Community
293 | mixtral-8x7b-instruct-v0.1 | 1191.00 | +/-9 | 9,663 | Mistral | Apache 2.0
294 | Llama3.1-8B-Instruct | 1190.00 | +/-28 | 382 | Facebook AI Research | Apache 2.0
295 | Llama3.1-8B-Instruct | 1189.00 | +/-8 | 7,135 | Facebook AI Research | Llama 3.1 Community
296 | jamba-1.5-mini | 1186.00 | +/-16 | 1,094 | AI21 Labs | Jamba Open
297 | command-r | 1175.00 | +/-9 | 6,682 | Cohere | CC-BY-NC-4.0
298 | Qwen3-VL-2B | 1168.00 | +/-19 | 908 | Alibaba | Apache 2.0
299 | Qwen1.5-14B-Chat | 1167.00 | +/-13 | 2,184 | Alibaba | Qianwen LICENSE
300 | llama-3.2-3b-instruct | 1165.00 | +/-16 | 1,136 | Meta | Llama 3.2
301 | gemma-2-2b-it | 1162.00 | +/-8 | 6,599 | Google | Gemma license
302 | snowflake-arctic-instruct | 1162.00 | +/-11 | 4,793 | Snowflake | Apache 2.0
303 | Gemma 1.1-7B-IT | 1160.00 | +/-11 | 3,039 | Google Research | Gemma license
304 | openchat-3.5-0106 | 1158.00 | +/-14 | 1,726 | OpenChat | Apache-2.0
305 | starling-lm-7b-beta | 1158.00 | +/-14 | 1,973 | Nexusflow | Apache-2.0
306 | WizardLM-70B-V1.0 | 1157.00 | +/-19 | 903 | WizardLM Team | Llama 2 Community
307 | DeepSeek LLM 67B Chat | 1155.00 | +/-23 | 576 | DeepSeek-AI | DeepSeek License
308 | smollm2-1.7b-instruct | 1152.00 | +/-33 | 271 | HuggingFace | Apache 2.0
309 | openhermes-2.5-mistral-7b | 1151.00 | +/-20 | 697 | NousResearch | Apache-2.0
310 | Yi-34B | 1151.00 | +/-13 | 2,043 | 01.AI | Yi License
311 | Phi-3-mini 3.8B | 1150.00 | +/-12 | 2,564 | Microsoft Azure | MIT
312 | tulu-2-dpo-70b | 1145.00 | +/-19 | 888 | AllenAI/UW | AI2 ImpACT Low-risk
313 | Phi-3-mini 3.8B | 1139.00 | +/-13 | 2,813 | Microsoft Azure | MIT
314 | llama-2-70b-chat | 1136.00 | +/-10 | 4,740 | Meta | Llama 2 Community
315 | Mistral-7B-Instruct-v0.2 | 1127.00 | +/-12 | 2,605 | MistralAI | Apache-2.0
316 | starling-lm-7b-alpha | 1126.00 | +/-16 | 1,300 | UC Berkeley | CC-BY-NC-4.0
317 | Qwen-14B-Chat | 1125.00 | +/-24 | 534 | Alibaba | Qianwen LICENSE
318 | dolphin-2.2.1-mistral-7b | 1125.00 | +/-32 | 219 | Cognitive Computations | Apache-2.0
319 | openchat-3.5 | 1125.00 | +/-18 | 945 | OpenChat | Apache-2.0
320 | llama-3.2-1b-instruct | 1124.00 | +/-16 | 1,162 | Meta | Llama 3.2
321 | Qwen1.5-7B-Chat | 1120.00 | +/-21 | 690 | Alibaba | Qianwen LICENSE
322 | Gemma 7B - It | 1117.00 | +/-16 | 1,120 | Google Research | Gemma license
323 | Vicuna 33B | 1115.00 | +/-13 | 2,663 | LM-SYS | Non-commercial
324 | PaLM 2 | 1115.00 | +/-19 | 901 | Google Research | Proprietary
325 | llama2-70b-steerlm-chat | 1114.00 | +/-27 | 440 | Nvidia | Llama 2 Community
326 | Baichuan2-13B-Chat | 1110.00 | +/-13 | 2,218 | Baichuan AI | Llama 2 Community
327 | solar-10.7b-instruct-v1.0 | 1109.00 | +/-22 | 604 | Upstage AI | CC-BY-NC-4.0
328 | CodeLLaMA-34B | 1109.00 | +/-19 | 770 | Facebook AI Research | Llama 2 Community
329 | Gemma 1.1-2B-IT | 1107.00 | +/-16 | 1,355 | Google Research | Gemma license
330 | MPT-30B-Chat | 1095.00 | +/-34 | 242 | MosaicML | CC-BY-NC-SA-4.0
331 | nous-hermes-2-mixtral-8x7b-dpo | 1093.00 | +/-21 | 628 | NousResearch | Apache-2.0
332 | Baichuan2-7B-Chat | 1086.00 | +/-14 | 1,656 | Baichuan AI | Llama 2 Community
333 | Qwen1.5-4B-Chat | 1086.00 | +/-18 | 988 | Alibaba | Qianwen LICENSE
334 | stripedhyena-nous-7b | 1084.00 | +/-20 | 676 | Together AI | Apache 2.0
335 | Vicuna 13B | 1082.00 | +/-14 | 2,146 | LM-SYS | Llama 2 Community
336 | zephyr-7b-beta | 1082.00 | +/-17 | 1,250 | HuggingFace | MIT
337 | Mistral 7B Instruct | 1081.00 | +/-19 | 974 | MistralAI | Apache 2.0
338 | guanaco-33b | 1080.00 | +/-32 | 280 | UW | Non-commercial
339 | Gemma 2B - It | 1070.00 | +/-22 | 597 | Google Research | Gemma license
340 | wizardlm-13b | 1064.00 | +/-21 | 669 | Microsoft | Llama 2 Community
341 | olmo-7b-instruct | 1054.00 | +/-19 | 848 | Ai2 | Apache-2.0
342 | Vicuna 7B | 1047.00 | +/-22 | 658 | LM-SYS | Llama 2 Community
343 | ChatGLM3-6B | 1042.00 | +/-23 | 576 | Zhipu AI | Apache-2.0
344 | GPT4All 13B | 998.00 | +/-37 | 211 | Nomic AI | Non-commercial
345 | alpaca-13b | 991.00 | +/-23 | 652 | Stanford | Non-commercial
346 | MPT-7B-Chat | 985.00 | +/-25 | 471 | MosaicML | CC-BY-NC-SA-4.0
347 | RWKV-4-Raven-14B | 983.00 | +/-24 | 544 | RWKV | Apache 2.0
348 | Koala | 980.00 | +/-21 | 751 | DAMO Academy | Non-commercial
349 | ChatGLM-6B | 976.00 | +/-26 | 525 | Zhipu AI | Non-commercial
350 | ChatGLM2-6B | 971.00 | +/-35 | 227 | Zhipu AI | Apache-2.0
351 | oasst-pythia-12b | 960.00 | +/-22 | 687 | OpenAssistant | Apache 2.0
352 | dolly-v2-12b | 950.00 | +/-29 | 370 | Databricks | MIT
353 | fastchat-t5-3b | 919.00 | +/-26 | 462 | LMSYS | Apache 2.0
354 | LLaMA 13B | 919.00 | +/-33 | 252 | Facebook AI Research | Non-commercial
355 | stablelm-tuned-alpha-7b | 890.00 | +/-29 | 353 | Stability AI | CC-BY-NC-SA-4.0

Data is for reference only; official sources are authoritative.
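
When reading the score and 95% CI columns together, one rough way to judge whether two models are meaningfully separated is to check whether their intervals overlap. The sketch below assumes the "+/-" value is a symmetric half-width around the score. The helper names are hypothetical, and overlap is only a heuristic for reading the table, not the leaderboard's own ranking rule.

```python
def interval(score: float, half_width: float) -> tuple[float, float]:
    """Return (low, high), treating the +/- value as a symmetric half-width (assumption)."""
    return (score - half_width, score + half_width)


def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True when two closed intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Scores and CI half-widths copied from the ranking table.
gemini_35_flash = interval(1518.00, 25)          # rank 1
claude_opus_46_thinking = interval(1516.00, 13)  # rank 2
claude_opus_4 = interval(1466.00, 9)             # rank 28 entry

top_two_overlap = intervals_overlap(gemini_35_flash, claude_opus_46_thinking)
rank_1_vs_28_overlap = intervals_overlap(gemini_35_flash, claude_opus_4)
```

Here the top two intervals overlap, so their order should be read with caution, while the rank 1 and rank 28 intervals do not overlap. Entries with few votes have wide intervals and therefore overlap with many neighbors.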

FAQ

01. What is LMArena Math Arena?

LMArena Math Arena is an anonymous evaluation track focused on mathematical reasoning. Users submit real math questions, compare hidden model solutions side by side, and vote for the better answer; the leaderboard is then calculated with Elo-style scoring.

02. How is Math Arena different from MATH-500 or AIME?

Static benchmarks such as MATH-500 and AIME use fixed problem sets and automated grading. Math Arena uses open-ended user questions and human preference voting, making it a useful complement for measuring how models handle varied real-world math tasks.

03. Do thinking models perform better in Math Arena?

Models with extended reasoning or chain-of-thought style capabilities often rank higher on math tasks because they spend more time decomposing and checking solutions. That benefit can come with higher latency and cost.

04. How do China-developed models perform in math?

DeepSeek, Qwen, GLM, and related models have become competitive in math reasoning leaderboards. Open licenses and Chinese-language support can make them especially useful for local deployment and education scenarios.