LMArena Math Arena Leaderboard

The latest AI math reasoning leaderboard based on LMArena Math Arena anonymous user voting. Covers Elo scores, confidence intervals, and vote counts for Claude, GPT, Gemini, DeepSeek, Qwen, and more.

Top Model: Claude Fable 5

Top Score: 1517.00

Model Count: 356

Data version: 2026-06-16

Data source: LM Arena

About This Leaderboard

This leaderboard ranks AI models by mathematical reasoning ability. Data comes from LMArena's Math sub-track, evaluated through anonymous blind testing by real users on math problem-solving tasks.

Methodology Overview

Blind testing: Users submit math problems, two anonymous models provide solutions, and users vote for the better answer, which removes brand bias.

Elo scoring: Elo-style scores are computed with the Bradley-Terry model. A higher score means users prefer that model's math solutions more often.

Broad scenario coverage: Testing spans algebra, geometry, calculus, competition math, and other real-world math tasks.
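As a rough illustration of the Elo scoring step, the sketch below fits a Bradley-Terry model to a list of pairwise votes and maps the fitted strengths onto an Elo-like scale. It is not LMArena's actual pipeline: the function name `fit_bradley_terry`, the simple iterative (minorization-maximization) solver, the 400-point base-10 scale, the 1000-point anchor, and the toy votes are all assumptions made for illustration. Ties are ignored, and every model is assumed to have at least one win.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Returns a dict mapping model name -> Elo-like score.
    Assumes every model has at least one win (otherwise its strength is 0).
    """
    wins = defaultdict(float)   # total wins per model
    games = defaultdict(float)  # number of battles per unordered pair
    models = set()
    for winner, loser in battles:
        wins[winner] += 1.0
        games[frozenset((winner, loser))] += 1.0
        models.update((winner, loser))

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            # MM update: wins divided by the sum of n_ij / (p_i + p_j)
            denom = sum(
                games[frozenset((m, other))] / (strength[m] + strength[other])
                for other in models
                if other != m and frozenset((m, other)) in games
            )
            updated[m] = wins[m] / denom
        # Fix the scale by normalizing the geometric mean of strengths to 1
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}

    # Assumed Elo-like mapping: 400 points per factor of 10, centered at 1000
    return {m: 1000.0 + 400.0 * math.log10(s) for m, s in strength.items()}


# Hypothetical votes: A usually beats B and C, B usually beats C
battles = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("A", "C")] * 3 + [("C", "A")]
)
scores = fit_bradley_terry(battles)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['A', 'B', 'C']
```

Only score differences carry meaning in this model. The anchor value shifts every score by the same amount and leaves the ranking unchanged.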

DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.


Ranking Table

Rank | Model | Score | 95% CI | Votes | Organization | License
1 | Claude Fable 5 | 1517.00 | +/-37 | 244 | Anthropic | Proprietary
2 | Gemini 3.5 Flash | 1516.00 | +/-25 | 584 | Google Deep Mind | Proprietary
3 | Claude Opus 4.6 (thinking) | 1516.00 | +/-12 | 2,502 | Anthropic | Proprietary
4 | Claude Opus 4.6 | 1504.00 | +/-12 | 2,867 | Anthropic | Proprietary
5 | Opus 4.7 (thinking) | 1504.00 | +/-14 | 1,779 | Anthropic | Proprietary
6 | GPT-5.4 (high) | 1503.00 | +/-13 | 2,285 | OpenAI | Proprietary
7 | Opus 4.7 | 1498.00 | +/-14 | 1,836 | Anthropic | Proprietary
8 | Claude Opus 4.8 (thinking) | 1496.00 | +/-22 | 648 | Anthropic | Proprietary
9 | Claude Opus 4.8 | 1495.00 | +/-23 | 648 | Anthropic | Proprietary
10 | Gemini 3.1 Pro Preview | 1495.00 | +/-11 | 3,429 | Google Deep Mind | Proprietary
11 | GPT-5.5 (high) | 1494.00 | +/-15 | 1,569 | OpenAI | Proprietary
12 | Qwen3.7-Max-Preview | 1492.00 | +/-40 | 219 | Alibaba | Proprietary
13 | GPT-5.5 | 1490.00 | +/-15 | 1,574 | OpenAI | Proprietary
14 | mimo-v2.5-pro | 1485.00 | +/-16 | 1,384 | Xiaomi | MIT
15 | Kimi K2.6 | 1483.00 | +/-16 | 1,372 | Moonshot AI | Modified MIT
16 | ERNIE-5.1-Preview | 1481.00 | +/-16 | 1,346 | Baidu | Proprietary
17 | Gemini 3.0 Pro (Preview 11-2025) | 1478.00 | +/-11 | 2,652 | Google Deep Mind | Proprietary
18 | DeepSeek-V4-Pro (thinking) | 1477.00 | +/-16 | 1,391 | DeepSeek-AI | MIT
19 | Qwen3.6-Max-Preview | 1476.00 | +/-30 | 358 | Alibaba | Proprietary
20 | Gemini 3.0 Flash | 1476.00 | +/-13 | 2,002 | Google Deep Mind | Proprietary
21 | GLM 5.1 | 1474.00 | +/-19 | 966 | Zhipu AI | MIT
22 | Kimi K2 Thinking | 1472.00 | +/-11 | 2,818 | Moonshot AI | Modified MIT
23 | grok-4.20-beta-0309-reasoning | 1470.00 | +/-13 | 2,399 | xAI | Proprietary
24 | Claude Opus 4 (thinking-32k) | 1470.00 | +/-12 | 2,266 | Anthropic | Proprietary
25 | Qwen3.5 Max Preview | 1469.00 | +/-16 | 1,352 | Alibaba | Proprietary
26 | Gemma 4 31B | 1469.00 | +/-28 | 399 | DeepMind | Apache 2.0
27 | Gemma 4 26B A4B | 1467.00 | +/-28 | 372 | DeepMind | Apache 2.0
28 | Claude Opus 4 | 1466.00 | +/-9 | 4,343 | Anthropic | Proprietary
29 | GPT-5.5 Instant | 1463.00 | +/-16 | 1,472 | OpenAI | Proprietary
30 | Muse Spark | 1463.00 | +/-20 | 862 | Facebook AI Research | Proprietary
31 | minimax-m3 | 1461.00 | +/-26 | 556 | MiniMax | Proprietary
32 | GPT-5.4 | 1460.00 | +/-12 | 2,433 | OpenAI | Proprietary
33 | Claude Sonnet 4.6 | 1459.00 | +/-13 | 2,296 | Anthropic | Proprietary
34 | GPT-5.2 Pro (high) | 1458.00 | +/-11 | 2,990 | OpenAI | Proprietary
35 | Claude Sonnet 4.5 (thinking-32k) | 1455.00 | +/-9 | 4,913 | Anthropic | Proprietary
36 | Gemini 3.0 Flash (minimal) | 1455.00 | +/-10 | 3,814 | Google Deep Mind | Proprietary
37 | GPT-5.1 Pro (high) | 1455.00 | +/-12 | 2,500 | OpenAI | Proprietary
38 | GPT-5.2 | 1453.00 | +/-13 | 2,084 | OpenAI | Proprietary
39 | Qwen 3.6 Plus Preview | 1453.00 | +/-14 | 1,720 | Alibaba | Proprietary
40 | mimo-v2-pro | 1452.00 | +/-15 | 1,632 | Xiaomi | Proprietary
41 | grok-4.20-multi-agent-beta-0309 | 1451.00 | +/-13 | 2,367 | xAI | Proprietary
42 | Grok 4.20 Beta | 1451.00 | +/-15 | 1,609 | xAI | Proprietary
43 | DOLA Seed 2.0 Pro | 1449.00 | +/-11 | 2,913 | ByteDance Seed | Proprietary
44 | mimo-v2.5 | 1448.00 | +/-15 | 1,467 | Xiaomi | MIT
45 | OpenAI o3 | 1447.00 | +/-10 | 3,728 | OpenAI | Proprietary
46 | Qwen3.5-397B-A17B | 1447.00 | +/-12 | 2,614 | Alibaba | Apache 2.0
47 | nvidia-nemotron-3-ultra-550b-a55b-nvfp4 | 1445.00 | +/-31 | 347 | Nvidia | OpenMDW-1.1
48 | Opus 4.1 (thinking-16k) | 1444.00 | +/-11 | 3,025 | Anthropic | Proprietary
49 | mimo-v2-omni | 1443.00 | +/-25 | 598 | Xiaomi | Proprietary
50 | Grok 4.1 Thinking | 1443.00 | +/-10 | 3,833 | xAI | Proprietary
51 | Kimi K2.5 Instant | 1442.00 | +/-25 | 513 | Moonshot AI | Modified MIT
52 | Gemini 2.5 Pro Experimental 03-25 | 1442.00 | +/-7 | 7,644 | Google Deep Mind | Proprietary
53 | gemini-3.1-flash-lite-preview | 1442.00 | +/-11 | 2,855 | Google | Proprietary
54 | GPT-5.4 mini (high) | 1441.00 | +/-13 | 2,233 | OpenAI | Proprietary
55 | GLM-5 | 1440.00 | +/-15 | 1,406 | Zhipu AI | MIT
56 | Qwen3 Max (Preview) | 1439.00 | +/-15 | 1,525 | Alibaba | Proprietary
57 | Kimi K2 Thinking (thinking-turbo) | 1438.00 | +/-10 | 3,785 | Moonshot AI | Modified MIT
58 | DeepSeek-V4-Pro | 1437.00 | +/-15 | 1,651 | DeepSeek-AI | MIT
59 | ERNIE 5.0 | 1437.00 | +/-13 | 2,150 | Baidu | Proprietary
60 | DeepSeek-V4-Flash (thinking) | 1436.00 | +/-16 | 1,511 | DeepSeek-AI | MIT
61 | longcat-flash-chat-2602-exp | 1436.00 | +/-14 | 1,753 | Meituan | Proprietary
62 | GPT-5-Pro (high) | 1434.00 | +/-14 | 1,887 | OpenAI | Proprietary
63 | GPT-5.2 | 1433.00 | +/-10 | 3,461 | OpenAI | Proprietary
64 | Opus 4.1 | 1433.00 | +/-9 | 4,724 | Anthropic | Proprietary
65 | mistral-medium-3.5 | 1433.00 | +/-25 | 519 | Mistral | Modified MIT
66 | GPT-5.4 nano (high) | 1432.00 | +/-13 | 2,079 | OpenAI | Proprietary
67 | qwen3-max-2025-09-23 | 1430.00 | +/-24 | 582 | Alibaba | Proprietary
68 | DeepSeek V3.2 | 1430.00 | +/-11 | 3,004 | DeepSeek-AI | MIT
69 | Qwen3.5-27B | 1429.00 | +/-15 | 1,653 | Alibaba | Apache 2.0
70 | Grok 4.1 | 1429.00 | +/-9 | 4,235 | xAI | Proprietary
71 | hunyuan-hy3-preview | 1429.00 | +/-28 | 405 | Tencent | tencent-hunyuan-community
72 | GLM-4.7 | 1428.00 | +/-21 | 710 | Zhipu AI | MIT
73 | Claude Sonnet 4.5 | 1428.00 | +/-9 | 4,913 | Anthropic | Proprietary
74 | Grok 4 | 1428.00 | +/-12 | 2,263 | xAI | Proprietary
75 | DeepSeek V3.2-Exp (thinking) | 1428.00 | +/-26 | 481 | DeepSeek-AI | MIT
76 | amazon-nova-experimental-chat-26-02-10 | 1428.00 | +/-39 | 207 | Amazon | Proprietary
77 | DeepSeek-V4-Flash | 1427.00 | +/-15 | 1,523 | DeepSeek-AI | MIT
78 | DeepSeek V3.2 (thinking) | 1426.00 | +/-12 | 2,506 | DeepSeek-AI | MIT
79 | GPT-5.3 | 1425.00 | +/-13 | 2,046 | OpenAI | Proprietary
80 | Qwen3.5-122B-A10B | 1424.00 | +/-14 | 1,779 | Alibaba | Apache 2.0
81 | GPT-5.1 Instant | 1424.00 | +/-11 | 2,866 | OpenAI | Proprietary
82 | Grok 4 Fast | 1423.00 | +/-29 | 398 | xAI | Proprietary
83 | GLM-4.6 | 1421.00 | +/-13 | 2,107 | Zhipu AI | MIT
84 | Claude Opus 4 (thinking-16k) | 1420.00 | +/-12 | 2,239 | Anthropic | Proprietary
85 | Qwen3-235B-A22B-2507 | 1420.00 | +/-8 | 5,924 | Alibaba | Apache 2.0
86 | Qwen3-Next | 1419.00 | +/-17 | 1,211 | Alibaba | Apache 2.0
87 | Grok 4.3 Beta | 1418.00 | +/-16 | 1,454 | xAI | Proprietary
88 | DeepSeek V3.2-Exp | 1418.00 | +/-21 | 775 | DeepSeek-AI | MIT
89 | Grok 4.1 Fast (fast-reasoning) | 1417.00 | +/-10 | 3,500 | xAI | Proprietary
90 | longcat-flash-chat | 1417.00 | +/-22 | 689 | Meituan | MIT
91 | Kimi K2 0905 | 1416.00 | +/-21 | 759 | Moonshot AI | Modified MIT
92 | OpenAI o4 - mini | 1415.00 | +/-11 | 2,939 | OpenAI | Proprietary
93 | DeepSeek-V3.1 | 1415.00 | +/-18 | 992 | DeepSeek-AI | MIT
94 | MiniMax-M2.7 | 1415.00 | +/-14 | 1,953 | MiniMaxAI | Modified MIT
95 | DeepSeek-V3.1 (thinking) | 1415.00 | +/-22 | 663 | DeepSeek-AI | MIT
96 | GLM-4.5 | 1413.00 | +/-15 | 1,425 | Zhipu AI | MIT
97 | GPT-5 | 1413.00 | +/-14 | 1,785 | OpenAI | Proprietary
98 | Gemini 2.5 Flash-Preview-09-2025 | 1412.00 | +/-13 | 1,944 | Google Deep Mind | Proprietary
99 | Grok 4 Fast (fast-reasoning) | 1412.00 | +/-18 | 1,084 | xAI | Proprietary
100 | DeepSeek-R1 | 1411.00 | +/-14 | 1,606 | DeepSeek-AI | MIT
101 | Qwen3-VL-235B-A22B-Instruct | 1411.00 | +/-23 | 704 | Alibaba | Apache 2.0
102 | amazon-nova-experimental-chat-26-01-10 | 1409.00 | +/-33 | 263 | Amazon | Proprietary
103 | GPT-4.5 | 1409.00 | +/-15 | 1,393 | OpenAI | Proprietary
104 | OpenAI o1 | 1409.00 | +/-11 | 2,986 | OpenAI | Proprietary
105 | Step 3.5 Flash | 1408.00 | +/-12 | 2,641 | StepFunAI | Apache 2.0
106 | ERNIE 5.0 | 1408.00 | +/-23 | 618 | Baidu | Proprietary
107 | DeepSeek-V3.1 Terminus (thinking) | 1407.00 | +/-41 | 197 | DeepSeek-AI | MIT
108 | Gemini 2.5 Flash | 1406.00 | +/-7 | 7,879 | Google Deep Mind | Proprietary
109 | OpenAI o3-mini (high) | 1406.00 | +/-13 | 1,909 | OpenAI | Proprietary
110 | GPT-5-mini (high) | 1405.00 | +/-15 | 1,459 | OpenAI | Proprietary
111 | Qwen3-VL-235B-A22B-Instruct (thinking) | 1405.00 | +/-28 | 427 | Alibaba | Apache 2.0
112 | GPT-4o(2025-03-27) | 1404.00 | +/-8 | 5,721 | OpenAI | Proprietary
113 | Claude Opus 4 | 1403.00 | +/-11 | 2,768 | Anthropic | Proprietary
114 | Claude Sonnet 4 (thinking-32k) | 1403.00 | +/-13 | 2,022 | Anthropic | Proprietary
115 | Step 3.5 Flash | 1403.00 | +/-12 | 2,404 | StepFunAI | Proprietary
116 | Mistral Large 3 | 1402.00 | +/-11 | 2,809 | MistralAI | Apache 2.0
117 | Hunyuan-T1 | 1401.00 | +/-38 | 236 | Tencent AI Lab | Proprietary
118 | amazon-nova-experimental-chat-12-10 | 1400.00 | +/-37 | 234 | Amazon | Proprietary
119 | Qwen3.5-35B-A3B | 1400.00 | +/-14 | 1,764 | Alibaba | Apache 2.0
120 | ERNIE 5.0 | 1400.00 | +/-34 | 268 | Baidu | Proprietary
121 | Qwen3-32B | 1399.00 | +/-30 | 316 | Alibaba | Apache 2.0
122 | Magistral-Medium-2506 | 1399.00 | +/-8 | 5,827 | MistralAI | Proprietary
123 | amazon-nova-experimental-chat-11-10 | 1398.00 | +/-15 | 1,584 | Amazon | Proprietary
124 | qwen3-235b-a22b-thinking-2507 | 1398.00 | +/-24 | 489 | Alibaba | Apache 2.0
125 | Haiku 4.5 | 1398.00 | +/-9 | 5,407 | Anthropic | Proprietary
126 | MiniMax M2.5 | 1397.00 | +/-12 | 2,436 | MiniMaxAI | Modified MIT
127 | DeepSeek-R1-0528 | 1396.00 | +/-20 | 869 | DeepSeek-AI | MIT
128 | DeepSeek-V3.1 Terminus | 1395.00 | +/-39 | 218 | DeepSeek-AI | MIT
129 | amazon-nova-experimental-chat-10-20 | 1395.00 | +/-20 | 806 | Amazon | Proprietary
130 | qwen3-235b-a22b-no-thinking | 1394.00 | +/-12 | 2,392 | Alibaba | Apache 2.0
131 | Qwen3-235B-A22B | 1393.00 | +/-14 | 1,604 | Alibaba | Apache 2.0
132 | M2.1 | 1392.00 | +/-18 | 1,010 | MiniMaxAI | MIT
133 | GLM-4.5-Air | 1390.00 | +/-15 | 1,540 | Zhipu AI | MIT
134 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1390.00 | +/-39 | 194 | Nvidia | Nvidia Open
135 | Qwen3-Next (thinking) | 1390.00 | +/-20 | 828 | Alibaba | Apache 2.0
136 | Kimi K2 | 1389.00 | +/-14 | 1,695 | Moonshot AI | Modified MIT
137 | OpenAI o3-mini (high) | 1388.00 | +/-18 | 977 | OpenAI | Proprietary
138 | Claude Sonnet 4 | 1388.00 | +/-12 | 2,472 | Anthropic | Proprietary
139 | OpenAI o1 | 1386.00 | +/-10 | 4,569 | OpenAI | Proprietary
140 | Claude Sonnet 3.7 (thinking-32k) | 1384.00 | +/-11 | 2,793 | Anthropic | Proprietary
141 | trinity-large-thinking | 1384.00 | +/-15 | 1,617 | Arcee AI | Apache 2.0
142 | intellect-3 | 1383.00 | +/-31 | 334 | Prime Intellect | MIT
143 | GPT OSS 120B | 1382.00 | +/-14 | 1,792 | OpenAI | Apache 2.0
144 | OpenAI o3-mini | 1382.00 | +/-8 | 4,721 | OpenAI | Proprietary
145 | Qwen3-30B-A3B-2507 | 1381.00 | +/-15 | 1,426 | Alibaba | Apache 2.0
146 | llama-3.1-nemotron-ultra-253b-v1 | 1380.00 | +/-37 | 209 | Nvidia | Nvidia Open Model
147 | mimo-v2-flash (non-thinking) | 1379.00 | +/-11 | 2,844 | Xiaomi | MIT
148 | Qwen3-Coder-480B-A35B | 1377.00 | +/-15 | 1,626 | Alibaba | Apache 2.0
149 | nvidia-nemotron-3-super-120b-a12b | 1375.00 | +/-25 | 515 | Nvidia | NVIDIA Open Model
150 | Grok 3 | 1374.00 | +/-11 | 2,677 | xAI | Proprietary
151 | GPT-4.1 | 1373.00 | +/-10 | 3,226 | OpenAI | Proprietary
152 | mimo-v2-flash (thinking) | 1373.00 | +/-22 | 632 | Xiaomi | MIT
153 | minimax-m1 | 1372.00 | +/-13 | 1,801 | MiniMax | Apache 2.0
154 | DeepSeek-V3-0324 | 1370.00 | +/-10 | 3,190 | DeepSeek-AI | MIT
155 | grok-3-mini-beta | 1369.00 | +/-14 | 1,529 | xAI | Proprietary
156 | GLM-4.7-Flash | 1366.00 | +/-21 | 716 | Zhipu AI | MIT
157 | Gemini 2.5 Flash-Lite (thinking) | 1365.00 | +/-12 | 2,094 | Google Deep Mind | Proprietary
158 | Gemini 2.5 Flash-Lite-Preview-09-2025 (no-thinking) | 1364.00 | +/-11 | 2,878 | Google Deep Mind | Proprietary
159 | Qwen2.5-Max | 1364.00 | +/-10 | 3,305 | Alibaba | Proprietary
160 | QwQ-32B | 1364.00 | +/-14 | 1,720 | Alibaba | Apache 2.0
161 | Step3 | 1364.00 | +/-31 | 351 | StepFunAI | Apache 2.0
162 | Claude Sonnet 3.7 | 1362.00 | +/-10 | 3,358 | Anthropic | Proprietary
163 | OpenAI o1-mini | 1362.00 | +/-8 | 7,499 | OpenAI | Proprietary
164 | trinity-large-preview | 1361.00 | +/-14 | 1,891 | Arcee AI | Apache 2.0
165 | GLM-4.5V | 1357.00 | +/-34 | 277 | Zhipu AI | MIT
166 | Gemini 2.0 Flash Experimental | 1356.00 | +/-9 | 4,065 | DeepMind | Proprietary
167 | MiniMax M2 | 1356.00 | +/-33 | 319 | MiniMaxAI | Apache 2.0
168 | GPT-4.1 mini | 1355.00 | +/-11 | 2,693 | OpenAI | Proprietary
169 | ling-flash-2.0 | 1354.00 | +/-27 | 460 | Ant Group | MIT
170 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1353.00 | +/-19 | 987 | Nvidia | NVIDIA Open Model
171 | Qwen3-30B-A3B | 1353.00 | +/-14 | 1,707 | Alibaba | Apache 2.0
172 | Claude 3.5 Sonnet | 1351.00 | +/-7 | 10,017 | Anthropic | Proprietary
173 | mistral-medium-2505 | 1349.00 | +/-12 | 2,229 | Mistral | Proprietary
174 | hunyuan-turbos-20250416 | 1348.00 | +/-20 | 845 | Tencent | Proprietary
175 | GPT-5-Nano (high) | 1344.00 | +/-27 | 493 | OpenAI | Proprietary
176 | Claude 3.5 Sonnet | 1342.00 | +/-7 | 11,359 | Anthropic | Proprietary
177 | ring-flash-2.0 | 1339.00 | +/-27 | 453 | Ant Group | MIT
178 | Mistral-Small-3.2 | 1339.00 | +/-18 | 1,042 | MistralAI | Apache 2.0
179 | Gemini 1.5 Pro | 1339.00 | +/-7 | 7,610 | Google Deep Mind | Proprietary
180 | GPT OSS 20B | 1336.00 | +/-22 | 680 | OpenAI | Apache 2.0
181 | Nova 2 Lite | 1335.00 | +/-20 | 826 | Amazon | Proprietary
182 | Gemini 2.0 Flash-Lite | 1326.00 | +/-10 | 2,814 | DeepMind | Proprietary
183 | qwen-plus-0125 | 1324.00 | +/-19 | 732 | Alibaba | Proprietary
184 | Gemma 3 - 27B (IT) | 1322.00 | +/-9 | 3,581 | Google Deep Mind | Gemma
185 | granite-4.1-8b | 1320.00 | +/-39 | 236 | IBM | Apache 2.0
186 | llama-3.1-405b-instruct-fp8 | 1319.00 | +/-8 | 8,482 | Meta | Llama 3.1 Community
187 | Llama 4 Maverick Instruct | 1318.00 | +/-11 | 2,838 | Facebook AI Research | Llama 4
188 | Gemma 3 - 12B (IT) | 1317.00 | +/-27 | 389 | Google Deep Mind | Gemma
189 | llama-3.1-405b-instruct-bf16 | 1315.00 | +/-8 | 5,215 | Meta | Llama 3.1 Community
190 | step-2-16k-exp-202412 | 1313.00 | +/-20 | 642 | StepFun | Proprietary
191 | athene-v2-chat | 1312.00 | +/-9 | 3,412 | NexusFlow | NexusFlow
192 | Claude3-Opus | 1312.00 | +/-6 | 25,769 | Anthropic | Proprietary
193 | olmo-3-32b-think | 1311.00 | +/-32 | 314 | Ai2 | Apache 2.0
194 | DeepSeek-V3 | 1311.00 | +/-11 | 2,721 | DeepSeek-AI | DeepSeek
195 | C4AI Command A (202503) | 1309.00 | +/-9 | 3,994 | CohereAI | CC-BY-NC-4.0
196 | Llama 4 Scout Instruct | 1309.00 | +/-13 | 1,945 | Facebook AI Research | Llama
197 | GPT-4o | 1309.00 | +/-8 | 6,826 | OpenAI | Proprietary
198 | yi-lightning | 1306.00 | +/-10 | 3,921 | 01 AI | Proprietary
199 | olmo-3.1-32b-instruct | 1306.00 | +/-23 | 696 | Ai2 | Apache 2.0
200 | gemini-advanced-0514 | 1305.00 | +/-10 | 6,395 | Google | Proprietary
201 | GPT-4o | 1305.00 | +/-7 | 15,103 | OpenAI | Proprietary
202 | qwen2.5-plus-1127 | 1304.00 | +/-14 | 1,404 | Alibaba | Proprietary
203 | GPT-4 | 1303.00 | +/-8 | 13,306 | OpenAI | Proprietary
204 | hunyuan-turbos-20250226 | 1302.00 | +/-31 | 238 | Tencent | Proprietary
205 | GPT-4 | 1299.00 | +/-8 | 12,374 | OpenAI | Proprietary
206 | step-1o-turbo-202506 | 1299.00 | +/-24 | 565 | StepFun | Proprietary
207 | glm-4-plus-0111 | 1298.00 | +/-19 | 721 | Zhipu | Proprietary
208 | Gemini 1.5 Pro | 1298.00 | +/-8 | 10,492 | Google Deep Mind | Proprietary
209 | Qwen2.5-VL-72B-Instruct | 1297.00 | +/-8 | 5,415 | Alibaba | Qwen
210 | olmo-3.1-32b-think | 1297.00 | +/-26 | 473 | Ai2 | Apache 2.0
211 | gpt-4-turbo-2024-04-09 | 1296.00 | +/-8 | 13,217 | OpenAI | Proprietary
212 | Llama3.3-70B-Instruct | 1296.00 | +/-8 | 5,777 | Facebook AI Research | Llama-3.3
213 | Grok 2 | 1294.00 | +/-7 | 8,950 | xAI | Proprietary
214 | hunyuan-large-2025-02-10 | 1294.00 | +/-24 | 497 | Tencent | Proprietary
215 | deepseek-v2.5-1210 | 1293.00 | +/-17 | 1,031 | DeepSeek | DeepSeek
216 | qwen-max-0919 | 1292.00 | +/-12 | 2,249 | Alibaba | Qwen
217 | hunyuan-standard-2025-02-10 | 1290.00 | +/-24 | 499 | Tencent | Proprietary
218 | gemini-1.5-flash-002 | 1288.00 | +/-9 | 4,789 | Google | Proprietary
219 | mistral-large-2407 | 1288.00 | +/-8 | 6,664 | Mistral | Mistral Research
220 | DeepSeek V2.5 | 1288.00 | +/-10 | 3,649 | DeepSeek-AI | DeepSeek
221 | glm-4-plus | 1287.00 | +/-10 | 3,599 | Zhipu AI | Proprietary
222 | Claude 3.5 Haiku | 1286.00 | +/-7 | 6,365 | Anthropic | Proprietary
223 | Magistral-Medium-2506 | 1286.00 | +/-26 | 554 | MistralAI | Proprietary
224 | GPT-4 | 1283.00 | +/-10 | 7,052 | OpenAI | Proprietary
225 | mistral-large-2411 | 1282.00 | +/-9 | 3,574 | Mistral | MRL
226 | hunyuan-large-vision | 1280.00 | +/-30 | 351 | Tencent | Proprietary
227 | hunyuan-turbo-0110 | 1279.00 | +/-31 | 243 | Tencent | Proprietary
228 | ibm-granite-h-small | 1279.00 | +/-32 | 358 | IBM | Apache 2.0
229 | Llama3.1-70B-Instruct | 1279.00 | +/-17 | 1,041 | Facebook AI Research | Llama 3.1
230 | Mistral-Small-3.1-24B-Instruct-2503 | 1278.00 | +/-13 | 2,129 | MistralAI | Apache 2.0
231 | GPT-4o mini | 1276.00 | +/-7 | 9,322 | OpenAI | Proprietary
232 | GPT-4 | 1275.00 | +/-8 | 11,181 | OpenAI | Proprietary
233 | GPT-4.1 nano | 1274.00 | +/-23 | 582 | OpenAI | Proprietary
234 | Qwen2-72B-Instruct | 1273.00 | +/-9 | 4,835 | Alibaba | Qianwen LICENSE
235 | grok-2-mini-2024-08-13 | 1273.00 | +/-8 | 7,261 | xAI | Proprietary
236 | deepseek-coder-v2 | 1271.00 | +/-13 | 1,858 | DeepSeek | DeepSeek License
237 | llama-3.1-nemotron-51b-instruct | 1271.00 | +/-22 | 507 | Nvidia | Llama 3.1
238 | Qwen2.5-Coder-32B-Instruct | 1270.00 | +/-19 | 725 | Alibaba | Apache 2.0
239 | amazon-nova-pro-v1.0 | 1269.00 | +/-10 | 2,978 | Amazon | Proprietary
240 | Llama3.1-70B-Instruct | 1269.00 | +/-8 | 7,677 | Facebook AI Research | Llama 3.1 Community
241 | Phi 4 - 14B | 1265.00 | +/-10 | 2,764 | Microsoft Azure | MIT
242 | llama-3.1-tulu-3-70b | 1264.00 | +/-25 | 397 | Ai2 | Llama 3.1
243 | Mistral Small 24B Instruct 2501 | 1262.00 | +/-13 | 1,683 | MistralAI | Apache 2.0
244 | athene-70b-0725 | 1261.00 | +/-10 | 2,921 | NexusFlow | CC-BY-NC-4.0
245 | Gemma-3n-E4B | 1260.00 | +/-15 | 1,572 | Google Deep Mind | Gemma
246 | Llama3-70B-Instruct | 1257.00 | +/-7 | 20,941 | Facebook AI Research | Llama 3 Community
247 | gemini-1.5-flash-001 | 1257.00 | +/-8 | 8,392 | Google | Proprietary
248 | Gemma 3 - 4B (IT) | 1254.00 | +/-28 | 423 | Google Deep Mind | Gemma
249 | Claude3-Sonnet | 1253.00 | +/-8 | 13,766 | Anthropic | Proprietary
250 | nemotron-4-340b-instruct | 1252.00 | +/-12 | 2,352 | Nvidia | NVIDIA Open Model
251 | hunyuan-standard-256k | 1250.00 | +/-29 | 361 | Tencent | Proprietary
252 | GLM4 | 1247.00 | +/-16 | 1,191 | Zhipu AI | Proprietary
253 | reka-core-20240904 | 1246.00 | +/-14 | 1,207 | Reka AI | Proprietary
254 | gemma-2-27b-it | 1246.00 | +/-7 | 10,170 | Google | Gemma license
255 | jamba-1.5-large | 1245.00 | +/-15 | 1,147 | AI21 Labs | Jamba Open
256 | amazon-nova-lite-v1.0 | 1244.00 | +/-11 | 2,511 | Amazon | Proprietary
257 | mistral-large-2402 | 1244.00 | +/-9 | 7,987 | Mistral | Proprietary
258 | C4AI Aya Vision 32B | 1232.00 | +/-10 | 3,854 | CohereAI | CC-BY-NC-4.0
259 | reka-flash-20240904 | 1232.00 | +/-14 | 1,284 | Reka AI | Proprietary
260 | Claude3-Haiku | 1231.00 | +/-7 | 14,983 | Anthropic | Proprietary
261 | command-r-plus-08-2024 | 1231.00 | +/-14 | 1,467 | Cohere | CC-BY-NC-4.0
262 | gemini-1.5-flash-8b-001 | 1229.00 | +/-8 | 5,036 | Google | Proprietary
263 | Mixtral-8x22B-Instruct-v0.1 | 1228.00 | +/-9 | 6,778 | MistralAI | Apache 2.0
264 | olmo-2-0325-32b-instruct | 1227.00 | +/-28 | 375 | Ai2 | Apache-2.0
265 | amazon-nova-micro-v1.0 | 1224.00 | +/-11 | 2,455 | Amazon | Proprietary
266 | Qwen1.5-110B-Chat | 1221.00 | +/-11 | 3,188 | Alibaba | Qianwen LICENSE
267 | mistral-medium | 1220.00 | +/-11 | 4,406 | Mistral | Proprietary
268 | gemma-2-9b-it | 1218.00 | +/-8 | 7,110 | Google | Gemma license
269 | Phi-3-medium 14B-preview | 1215.00 | +/-11 | 3,238 | Microsoft Azure | MIT
270 | ministral-8b-2410 | 1214.00 | +/-20 | 683 | Mistral | MRL
271 | C4AI Command R+ | 1213.00 | +/-8 | 9,769 | CohereAI | CC-BY-NC-4.0
272 | Yi-1.5-34B | 1213.00 | +/-11 | 2,985 | 01.AI | Apache-2.0
273 | QwQ-32B-Preview | 1212.00 | +/-24 | 480 | Alibaba | Apache 2.0
274 | reka-flash-21b-20240226-online | 1211.00 | +/-14 | 2,028 | Reka AI | Proprietary
275 | Qwen1.5-72B-Chat | 1208.00 | +/-10 | 5,327 | Alibaba | Qianwen LICENSE
276 | InternLM2-Base-20B | 1207.00 | +/-15 | 1,387 | Shanghai AI Laboratory | Other
277 | llama-3.1-tulu-3-8b | 1206.00 | +/-26 | 363 | Ai2 | Llama 3.1
278 | command-r-08-2024 | 1206.00 | +/-14 | 1,601 | Cohere | CC-BY-NC-4.0
279 | gemma-2-9b-it-simpo | 1205.00 | +/-15 | 1,285 | Princeton | MIT
280 | gpt-3.5-turbo-1106 | 1203.00 | +/-15 | 2,134 | OpenAI | Proprietary
281 | qwen1.5-32b-chat | 1200.00 | +/-12 | 2,649 | Alibaba | Qianwen LICENSE
282 | C4AI Aya Vision 8B | 1200.00 | +/-15 | 1,307 | CohereAI | CC-BY-NC-4.0
283 | gpt-3.5-turbo-0125 | 1200.00 | +/-8 | 8,626 | OpenAI | Proprietary
284 | Gemini-pro | 1199.00 | +/-19 | 993 | DeepMind | Proprietary
285 | reka-flash-21b-20240226 | 1199.00 | +/-11 | 3,363 | Reka AI | Proprietary
286 | granite-3.1-2b-instruct | 1197.00 | +/-26 | 391 | IBM | Apache 2.0
287 | granite-3.0-8b-instruct | 1197.00 | +/-19 | 873 | IBM | Apache 2.0
288 | zephyr-orpo-141b-A35b-v0.1 | 1196.00 | +/-22 | 589 | HuggingFace | Apache 2.0
289 | gemini-pro-dev-api | 1196.00 | +/-14 | 2,274 | Google | Proprietary
290 | DBRX Instruct | 1196.00 | +/-11 | 4,001 | databricks | DBRX LICENSE
291 | Phi-3-mini 3.8B | 1193.00 | +/-14 | 1,568 | Microsoft Azure | MIT
292 | Phi-3-small 7B | 1193.00 | +/-13 | 2,092 | Microsoft Azure | MIT
293 | Llama3-8B-Instruct | 1192.00 | +/-8 | 14,252 | Facebook AI Research | Llama 3 Community
294 | mixtral-8x7b-instruct-v0.1 | 1191.00 | +/-9 | 9,663 | Mistral | Apache 2.0
295 | Llama3.1-8B-Instruct | 1190.00 | +/-28 | 382 | Facebook AI Research | Apache 2.0
296 | Llama3.1-8B-Instruct | 1189.00 | +/-8 | 7,135 | Facebook AI Research | Llama 3.1 Community
297 | jamba-1.5-mini | 1186.00 | +/-16 | 1,094 | AI21 Labs | Jamba Open
298 | command-r | 1176.00 | +/-9 | 6,682 | Cohere | CC-BY-NC-4.0
299 | Qwen3-VL-2B | 1168.00 | +/-19 | 908 | Alibaba | Apache 2.0
300 | Qwen1.5-14B-Chat | 1167.00 | +/-14 | 2,184 | Alibaba | Qianwen LICENSE
301 | llama-3.2-3b-instruct | 1165.00 | +/-16 | 1,136 | Meta | Llama 3.2
302 | gemma-2-2b-it | 1163.00 | +/-8 | 6,599 | Google | Gemma license
303 | snowflake-arctic-instruct | 1162.00 | +/-11 | 4,793 | Snowflake | Apache 2.0
304 | Gemma 1.1-7B-IT | 1160.00 | +/-11 | 3,039 | Google Research | Gemma license
305 | openchat-3.5-0106 | 1158.00 | +/-14 | 1,726 | OpenChat | Apache-2.0
306 | starling-lm-7b-beta | 1158.00 | +/-14 | 1,973 | Nexusflow | Apache-2.0
307 | WizardLM-70B-V1.0 | 1157.00 | +/-19 | 903 | WizardLM Team | Llama 2 Community
308 | DeepSeek LLM 67B Chat | 1155.00 | +/-23 | 576 | DeepSeek-AI | DeepSeek License
309 | smollm2-1.7b-instruct | 1152.00 | +/-33 | 271 | HuggingFace | Apache 2.0
310 | openhermes-2.5-mistral-7b | 1151.00 | +/-20 | 697 | NousResearch | Apache-2.0
311 | Yi-34B | 1151.00 | +/-13 | 2,043 | 01.AI | Yi License
312 | Phi-3-mini 3.8B | 1150.00 | +/-12 | 2,564 | Microsoft Azure | MIT
313 | tulu-2-dpo-70b | 1145.00 | +/-19 | 888 | AllenAI/UW | AI2 ImpACT Low-risk
314 | Phi-3-mini 3.8B | 1139.00 | +/-13 | 2,813 | Microsoft Azure | MIT
315 | llama-2-70b-chat | 1136.00 | +/-10 | 4,740 | Meta | Llama 2 Community
316 | Mistral-7B-Instruct-v0.2 | 1127.00 | +/-12 | 2,605 | MistralAI | Apache-2.0
317 | starling-lm-7b-alpha | 1126.00 | +/-16 | 1,300 | UC Berkeley | CC-BY-NC-4.0
318 | Qwen-14B-Chat | 1125.00 | +/-24 | 534 | Alibaba | Qianwen LICENSE
319 | dolphin-2.2.1-mistral-7b | 1125.00 | +/-32 | 219 | Cognitive Computations | Apache-2.0
320 | openchat-3.5 | 1125.00 | +/-18 | 945 | OpenChat | Apache-2.0
321 | llama-3.2-1b-instruct | 1124.00 | +/-16 | 1,162 | Meta | Llama 3.2
322 | Qwen1.5-7B-Chat | 1120.00 | +/-20 | 690 | Alibaba | Qianwen LICENSE
323 | Gemma 7B - It | 1118.00 | +/-16 | 1,120 | Google Research | Gemma license
324 | Vicuna 33B | 1115.00 | +/-13 | 2,663 | LM-SYS | Non-commercial
325 | PaLM 2 | 1115.00 | +/-19 | 901 | Google Research | Proprietary
326 | llama2-70b-steerlm-chat | 1114.00 | +/-27 | 440 | Nvidia | Llama 2 Community
327 | Baichuan2-13B-Chat | 1110.00 | +/-13 | 2,218 | Baichuan AI | Llama 2 Community
328 | CodeLLaMA-34B | 1109.00 | +/-19 | 770 | Facebook AI Research | Llama 2 Community
329 | solar-10.7b-instruct-v1.0 | 1109.00 | +/-22 | 604 | Upstage AI | CC-BY-NC-4.0
330 | Gemma 1.1-2B-IT | 1108.00 | +/-16 | 1,355 | Google Research | Gemma license
331 | MPT-30B-Chat | 1095.00 | +/-34 | 242 | MosaicML | CC-BY-NC-SA-4.0
332 | nous-hermes-2-mixtral-8x7b-dpo | 1093.00 | +/-21 | 628 | NousResearch | Apache-2.0
333 | Baichuan2-7B-Chat | 1086.00 | +/-14 | 1,656 | Baichuan AI | Llama 2 Community
334 | Qwen1.5-4B-Chat | 1086.00 | +/-18 | 988 | Alibaba | Qianwen LICENSE
335 | stripedhyena-nous-7b | 1084.00 | +/-20 | 676 | Together AI | Apache 2.0
336 | Vicuna 13B | 1083.00 | +/-14 | 2,146 | LM-SYS | Llama 2 Community
337 | zephyr-7b-beta | 1082.00 | +/-17 | 1,250 | HuggingFace | MIT
338 | Mistral 7B Instruct | 1082.00 | +/-19 | 974 | MistralAI | Apache 2.0
339 | guanaco-33b | 1080.00 | +/-32 | 280 | UW | Non-commercial
340 | Gemma 2B - It | 1070.00 | +/-22 | 597 | Google Research | Gemma license
341 | wizardlm-13b | 1064.00 | +/-21 | 669 | Microsoft | Llama 2 Community
342 | olmo-7b-instruct | 1054.00 | +/-19 | 848 | Ai2 | Apache-2.0
343 | Vicuna 7B | 1047.00 | +/-22 | 658 | LM-SYS | Llama 2 Community
344 | ChatGLM3-6B | 1042.00 | +/-23 | 576 | Zhipu AI | Apache-2.0
345 | GPT4All 13B | 998.00 | +/-37 | 211 | Nomic AI | Non-commercial
346 | alpaca-13b | 992.00 | +/-23 | 652 | Stanford | Non-commercial
347 | MPT-7B-Chat | 985.00 | +/-25 | 471 | MosaicML | CC-BY-NC-SA-4.0
348 | RWKV-4-Raven-14B | 983.00 | +/-24 | 544 | RWKV | Apache 2.0
349 | Koala | 980.00 | +/-21 | 751 | DAMO Academy | Non-commercial
350 | ChatGLM-6B | 976.00 | +/-26 | 525 | Zhipu AI | Non-commercial
351 | ChatGLM2-6B | 971.00 | +/-35 | 227 | Zhipu AI | Apache-2.0
352 | oasst-pythia-12b | 960.00 | +/-22 | 687 | OpenAssistant | Apache 2.0
353 | dolly-v2-12b | 950.00 | +/-29 | 370 | Databricks | MIT
354 | fastchat-t5-3b | 919.00 | +/-26 | 462 | LMSYS | Apache 2.0
355 | LLaMA 13B | 919.00 | +/-33 | 252 | Facebook AI Research | Non-commercial
356 | stablelm-tuned-alpha-7b | 890.00 | +/-29 | 353 | Stability AI | CC-BY-NC-SA-4.0

Data is for reference only. Official sources are authoritative.
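When reading the table, the 95% CI column matters as much as the score: if two models' intervals overlap, their relative order is not well established by the votes collected so far. The sketch below applies that check to three rows. The helper names are hypothetical, and the values assume the top row reads as a score of 1517.00 with +/-37 from 244 votes. Overlap is only a rough heuristic, not a formal significance test.

```python
def confidence_interval(score, half_width):
    """Return (low, high) bounds for a score reported as score +/- half_width."""
    return (score - half_width, score + half_width)


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


fable = confidence_interval(1517.0, 37)    # Claude Fable 5, rank 1
opus_46 = confidence_interval(1504.0, 12)  # Claude Opus 4.6, rank 4
gpt_4o = confidence_interval(1309.0, 8)    # GPT-4o, rank 197

print(intervals_overlap(fable, opus_46))  # True: 1480..1554 overlaps 1492..1516
print(intervals_overlap(fable, gpt_4o))   # False: 1480..1554 lies above 1301..1317
```

By this reading, the gap between rank 1 and rank 4 is within the noise of a 244-vote sample, while the gap to GPT-4o is not.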

FAQ

01. What is LMArena Math Arena?

LMArena Math Arena is an anonymous evaluation track focused on mathematical reasoning. Users submit real math questions, compare hidden model solutions side by side, and vote for the better answer; the leaderboard is then calculated with Elo-style scoring.
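To give a feel for what a score gap means under Elo-style scoring, the sketch below converts a score difference into an expected head-to-head preference rate. It assumes the conventional Elo logistic curve with a 400-point base-10 scale, which this page does not confirm. The function name `expected_win_rate` is hypothetical.

```python
def expected_win_rate(score_a, score_b, scale=400.0):
    """Expected share of votes for model A over model B under an Elo-style curve.

    Assumes the conventional mapping P(A beats B) = 1 / (1 + 10 ** ((B - A) / scale)).
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# A 13-point gap (1517 vs. 1504) is close to a coin flip: roughly 0.52
print(round(expected_win_rate(1517.0, 1504.0), 2))

# Equal scores give exactly 0.5
print(expected_win_rate(1500.0, 1500.0))
```

Under this assumption, gaps of a few points near the top of the table correspond to preference rates only slightly above 50%.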

02. How is Math Arena different from MATH-500 or AIME?

Static benchmarks such as MATH-500 and AIME use fixed problem sets and automated grading. Math Arena uses open-ended user questions and human preference voting, making it a useful complement for measuring how models handle varied real-world math tasks.

03. Do thinking models perform better in Math Arena?

Models with extended reasoning or chain-of-thought style capabilities often rank higher on math tasks because they spend more time decomposing and checking solutions. That benefit can come with higher latency and cost.

04. How do China-developed models perform in math?

DeepSeek, Qwen, GLM, and related models have become competitive in math reasoning leaderboards. Open licenses and Chinese-language support can make them especially useful for local deployment and education scenarios.