LMArena Math Arena Leaderboard

Name: LMArena Math Arena Leaderboard
Creator: DataLearner
License: https://creativecommons.org/licenses/by/4.0/

This is the latest AI math reasoning leaderboard, based on anonymous user voting in LMArena Math Arena. It covers Elo scores, confidence intervals, and vote counts for Claude, GPT, Gemini, DeepSeek, Qwen, and other models.

Top Model

Claude Fable 5

Top Score

1517.00

Model Count

356

Data version

2026-06-16

Data source: LM Arena

About This Leaderboard

This leaderboard ranks AI models by mathematical reasoning ability. Data comes from LMArena's Math sub-track, evaluated through anonymous blind testing by real users on math problem-solving tasks.

Methodology Overview

Blind testing: Users submit math problems, two anonymous models provide solutions, and users vote for the better answer. Because model identities are hidden during voting, brand bias is reduced.

Elo scoring: Uses the Bradley-Terry model to calculate Elo scores. Higher scores mean users more frequently prefer that model's math solutions.

Broad scenario coverage: Testing spans algebra, geometry, calculus, competition math, and other real-world math tasks.
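The Elo scoring step above can be illustrated with a minimal sketch. It fits Bradley-Terry strengths to pairwise votes with a simple iterative update and maps them onto an Elo-like scale. The vote data, the function names `fit_bradley_terry` and `to_elo`, and the scale constants (400 points per factor of 10, base rating 1000) are assumptions for illustration, not LMArena's published implementation.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iterations=500):
    """Estimate Bradley-Terry strengths from (winner, loser) vote pairs.

    Uses the classic minorization-maximization update. Assumes every model
    has at least one win; otherwise the maximum-likelihood fit does not exist.
    """
    models = sorted({model for pair in votes for model in pair})
    wins = defaultdict(int)
    games = defaultdict(int)  # number of comparisons per unordered pair
    for winner, loser in votes:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1
    if any(wins[m] == 0 for m in models):
        raise ValueError("every model needs at least one win")

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for m in models:
            denom = sum(
                games[frozenset((m, other))] / (strength[m] + strength[other])
                for other in models
                if other != m
            )
            updated[m] = wins[m] / denom
        # Normalize so the geometric mean of the strengths stays at 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo(strength, scale=400.0, base_rating=1000.0):
    """Map strengths to an Elo-like scale (assumed constants, see lead-in)."""
    return {m: base_rating + scale * math.log10(s) for m, s in strength.items()}


# Hypothetical votes: each tuple is (preferred model, other model).
votes = (
    [("A", "B")] * 7 + [("B", "A")] * 3
    + [("B", "C")] * 6 + [("C", "B")] * 4
    + [("A", "C")] * 8 + [("C", "A")] * 2
)
ratings = to_elo(fit_bradley_terry(votes))
for model, rating in sorted(ratings.items(), key=lambda item: -item[1]):
    print(f"{model}: {rating:.0f}")
```

Only rating differences carry meaning in this sketch; the base rating simply shifts every score by the same amount.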

DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.

Ranking Table

Rank	Model	Score	95% CI	Votes	Organization	License
1	Claude Fable 5	1517.00	+/-37	244	Anthropic	Proprietary
2	Gemini 3.5 Flash	1516.00	+/-25	584	Google DeepMind	Proprietary
3	Claude Opus 4.6 (thinking)	1516.00	+/-12	2,502	Anthropic	Proprietary
4	Claude Opus 4.6	1504.00	+/-12	2,867	Anthropic	Proprietary
5	Opus 4.7 (thinking)	1504.00	+/-14	1,779	Anthropic	Proprietary
6	GPT-5.4 (high)	1503.00	+/-13	2,285	OpenAI	Proprietary
7	Opus 4.7	1498.00	+/-14	1,836	Anthropic	Proprietary
8	Claude Opus 4.8 (thinking)	1496.00	+/-22	648	Anthropic	Proprietary
9	Claude Opus 4.8	1495.00	+/-23	648	Anthropic	Proprietary
10	Gemini 3.1 Pro Preview	1495.00	+/-11	3,429	Google DeepMind	Proprietary
11	GPT-5.5 (high)	1494.00	+/-15	1,569	OpenAI	Proprietary
12	Qwen3.7-Max-Preview	1492.00	+/-40	219	Alibaba	Proprietary
13	GPT-5.5	1490.00	+/-15	1,574	OpenAI	Proprietary
14	mimo-v2.5-pro	1485.00	+/-16	1,384	Xiaomi	MIT
15	Kimi K2.6	1483.00	+/-16	1,372	Moonshot AI	Modified MIT
16	ERNIE-5.1-Preview	1481.00	+/-16	1,346	Baidu	Proprietary
17	Gemini 3.0 Pro (Preview 11-2025)	1478.00	+/-11	2,652	Google DeepMind	Proprietary
18	DeepSeek-V4-Pro (thinking)	1477.00	+/-16	1,391	DeepSeek-AI	MIT
19	Qwen3.6-Max-Preview	1476.00	+/-30	358	Alibaba	Proprietary
20	Gemini 3.0 Flash	1476.00	+/-13	2,002	Google DeepMind	Proprietary
21	GLM 5.1	1474.00	+/-19	966	Zhipu AI	MIT
22	Kimi K2 Thinking	1472.00	+/-11	2,818	Moonshot AI	Modified MIT
23	grok-4.20-beta-0309-reasoning	1470.00	+/-13	2,399	xAI	Proprietary
24	Claude Opus 4 (thinking-32k)	1470.00	+/-12	2,266	Anthropic	Proprietary
25	Qwen3.5 Max Preview	1469.00	+/-16	1,352	Alibaba	Proprietary
26	Gemma 4 31B	1469.00	+/-28	399	DeepMind	Apache 2.0
27	Gemma 4 26B A4B	1467.00	+/-28	372	DeepMind	Apache 2.0
28	Claude Opus 4	1466.00	+/-9	4,343	Anthropic	Proprietary
29	GPT-5.5 Instant	1463.00	+/-16	1,472	OpenAI	Proprietary
30	Muse Spark	1463.00	+/-20	862	Facebook AI Research	Proprietary
31	minimax-m3	1461.00	+/-26	556	MiniMax	Proprietary
32	GPT-5.4	1460.00	+/-12	2,433	OpenAI	Proprietary
33	Claude Sonnet 4.6	1459.00	+/-13	2,296	Anthropic	Proprietary
34	GPT-5.2 Pro (high)	1458.00	+/-11	2,990	OpenAI	Proprietary
35	Claude Sonnet 4.5 (thinking-32k)	1455.00	+/-9	4,913	Anthropic	Proprietary
36	Gemini 3.0 Flash (minimal)	1455.00	+/-10	3,814	Google DeepMind	Proprietary
37	GPT-5.1 Pro (high)	1455.00	+/-12	2,500	OpenAI	Proprietary
38	GPT-5.2	1453.00	+/-13	2,084	OpenAI	Proprietary
39	Qwen 3.6 Plus Preview	1453.00	+/-14	1,720	Alibaba	Proprietary
40	mimo-v2-pro	1452.00	+/-15	1,632	Xiaomi	Proprietary
41	grok-4.20-multi-agent-beta-0309	1451.00	+/-13	2,367	xAI	Proprietary
42	Grok 4.20 Beta	1451.00	+/-15	1,609	xAI	Proprietary
43	DOLA Seed 2.0 Pro	1449.00	+/-11	2,913	ByteDance Seed	Proprietary
44	mimo-v2.5	1448.00	+/-15	1,467	Xiaomi	MIT
45	OpenAI o3	1447.00	+/-10	3,728	OpenAI	Proprietary
46	Qwen3.5-397B-A17B	1447.00	+/-12	2,614	Alibaba	Apache 2.0
47	nvidia-nemotron-3-ultra-550b-a55b-nvfp4	1445.00	+/-31	347	Nvidia	OpenMDW-1.1
48	Opus 4.1 (thinking-16k)	1444.00	+/-11	3,025	Anthropic	Proprietary
49	mimo-v2-omni	1443.00	+/-25	598	Xiaomi	Proprietary
50	Grok 4.1 Thinking	1443.00	+/-10	3,833	xAI	Proprietary
51	Kimi K2.5 Instant	1442.00	+/-25	513	Moonshot AI	Modified MIT
52	Gemini 2.5 Pro Experimental 03-25	1442.00	+/-7	7,644	Google DeepMind	Proprietary
53	gemini-3.1-flash-lite-preview	1442.00	+/-11	2,855	Google	Proprietary
54	GPT-5.4 mini (high)	1441.00	+/-13	2,233	OpenAI	Proprietary
55	GLM-5	1440.00	+/-15	1,406	Zhipu AI	MIT
56	Qwen3 Max (Preview)	1439.00	+/-15	1,525	Alibaba	Proprietary
57	Kimi K2 Thinking (thinking-turbo)	1438.00	+/-10	3,785	Moonshot AI	Modified MIT
58	DeepSeek-V4-Pro	1437.00	+/-15	1,651	DeepSeek-AI	MIT
59	ERNIE 5.0	1437.00	+/-13	2,150	Baidu	Proprietary
60	DeepSeek-V4-Flash (thinking)	1436.00	+/-16	1,511	DeepSeek-AI	MIT
61	longcat-flash-chat-2602-exp	1436.00	+/-14	1,753	Meituan	Proprietary
62	GPT-5-Pro (high)	1434.00	+/-14	1,887	OpenAI	Proprietary
63	GPT-5.2	1433.00	+/-10	3,461	OpenAI	Proprietary
64	Opus 4.1	1433.00	+/-9	4,724	Anthropic	Proprietary
65	mistral-medium-3.5	1433.00	+/-25	519	Mistral	Modified MIT
66	GPT-5.4 nano (high)	1432.00	+/-13	2,079	OpenAI	Proprietary
67	qwen3-max-2025-09-23	1430.00	+/-24	582	Alibaba	Proprietary
68	DeepSeek V3.2	1430.00	+/-11	3,004	DeepSeek-AI	MIT
69	Qwen3.5-27B	1429.00	+/-15	1,653	Alibaba	Apache 2.0
70	Grok 4.1	1429.00	+/-9	4,235	xAI	Proprietary
71	hunyuan-hy3-preview	1429.00	+/-28	405	Tencent	tencent-hunyuan-community
72	GLM-4.7	1428.00	+/-21	710	Zhipu AI	MIT
73	Claude Sonnet 4.5	1428.00	+/-9	4,913	Anthropic	Proprietary
74	Grok 4	1428.00	+/-12	2,263	xAI	Proprietary
75	DeepSeek V3.2-Exp (thinking)	1428.00	+/-26	481	DeepSeek-AI	MIT
76	amazon-nova-experimental-chat-26-02-10	1428.00	+/-39	207	Amazon	Proprietary
77	DeepSeek-V4-Flash	1427.00	+/-15	1,523	DeepSeek-AI	MIT
78	DeepSeek V3.2 (thinking)	1426.00	+/-12	2,506	DeepSeek-AI	MIT
79	GPT-5.3	1425.00	+/-13	2,046	OpenAI	Proprietary
80	Qwen3.5-122B-A10B	1424.00	+/-14	1,779	Alibaba	Apache 2.0
81	GPT-5.1 Instant	1424.00	+/-11	2,866	OpenAI	Proprietary
82	Grok 4 Fast	1423.00	+/-29	398	xAI	Proprietary
83	GLM-4.6	1421.00	+/-13	2,107	Zhipu AI	MIT
84	Claude Opus 4 (thinking-16k)	1420.00	+/-12	2,239	Anthropic	Proprietary
85	Qwen3-235B-A22B-2507	1420.00	+/-8	5,924	Alibaba	Apache 2.0
86	Qwen3-Next	1419.00	+/-17	1,211	Alibaba	Apache 2.0
87	Grok 4.3 Beta	1418.00	+/-16	1,454	xAI	Proprietary
88	DeepSeek V3.2-Exp	1418.00	+/-21	775	DeepSeek-AI	MIT
89	Grok 4.1 Fast (fast-reasoning)	1417.00	+/-10	3,500	xAI	Proprietary
90	longcat-flash-chat	1417.00	+/-22	689	Meituan	MIT
91	Kimi K2 0905	1416.00	+/-21	759	Moonshot AI	Modified MIT
92	OpenAI o4 - mini	1415.00	+/-11	2,939	OpenAI	Proprietary
93	DeepSeek-V3.1	1415.00	+/-18	992	DeepSeek-AI	MIT
94	MiniMax-M2.7	1415.00	+/-14	1,953	MiniMaxAI	Modified MIT
95	DeepSeek-V3.1 (thinking)	1415.00	+/-22	663	DeepSeek-AI	MIT
96	GLM-4.5	1413.00	+/-15	1,425	Zhipu AI	MIT
97	GPT-5	1413.00	+/-14	1,785	OpenAI	Proprietary
98	Gemini 2.5 Flash-Preview-09-2025	1412.00	+/-13	1,944	Google DeepMind	Proprietary
99	Grok 4 Fast (fast-reasoning)	1412.00	+/-18	1,084	xAI	Proprietary
100	DeepSeek-R1	1411.00	+/-14	1,606	DeepSeek-AI	MIT
101	Qwen3-VL-235B-A22B-Instruct	1411.00	+/-23	704	Alibaba	Apache 2.0
102	amazon-nova-experimental-chat-26-01-10	1409.00	+/-33	263	Amazon	Proprietary
103	GPT-4.5	1409.00	+/-15	1,393	OpenAI	Proprietary
104	OpenAI o1	1409.00	+/-11	2,986	OpenAI	Proprietary
105	Step 3.5 Flash	1408.00	+/-12	2,641	StepFunAI	Apache 2.0
106	ERNIE 5.0	1408.00	+/-23	618	Baidu	Proprietary
107	DeepSeek-V3.1 Terminus (thinking)	1407.00	+/-41	197	DeepSeek-AI	MIT
108	Gemini 2.5 Flash	1406.00	+/-7	7,879	Google DeepMind	Proprietary
109	OpenAI o3-mini (high)	1406.00	+/-13	1,909	OpenAI	Proprietary
110	GPT-5-mini (high)	1405.00	+/-15	1,459	OpenAI	Proprietary
111	Qwen3-VL-235B-A22B-Instruct (thinking)	1405.00	+/-28	427	Alibaba	Apache 2.0
112	GPT-4o(2025-03-27)	1404.00	+/-8	5,721	OpenAI	Proprietary
113	Claude Opus 4	1403.00	+/-11	2,768	Anthropic	Proprietary
114	Claude Sonnet 4 (thinking-32k)	1403.00	+/-13	2,022	Anthropic	Proprietary
115	Step 3.5 Flash	1403.00	+/-12	2,404	StepFunAI	Proprietary
116	Mistral Large 3	1402.00	+/-11	2,809	MistralAI	Apache 2.0
117	Hunyuan-T1	1401.00	+/-38	236	Tencent AI Lab	Proprietary
118	amazon-nova-experimental-chat-12-10	1400.00	+/-37	234	Amazon	Proprietary
119	Qwen3.5-35B-A3B	1400.00	+/-14	1,764	Alibaba	Apache 2.0
120	ERNIE 5.0	1400.00	+/-34	268	Baidu	Proprietary
121	Qwen3-32B	1399.00	+/-30	316	Alibaba	Apache 2.0
122	Magistral-Medium-2506	1399.00	+/-8	5,827	MistralAI	Proprietary
123	amazon-nova-experimental-chat-11-10	1398.00	+/-15	1,584	Amazon	Proprietary
124	qwen3-235b-a22b-thinking-2507	1398.00	+/-24	489	Alibaba	Apache 2.0
125	Haiku 4.5	1398.00	+/-9	5,407	Anthropic	Proprietary
126	MiniMax M2.5	1397.00	+/-12	2,436	MiniMaxAI	Modified MIT
127	DeepSeek-R1-0528	1396.00	+/-20	869	DeepSeek-AI	MIT
128	DeepSeek-V3.1 Terminus	1395.00	+/-39	218	DeepSeek-AI	MIT
129	amazon-nova-experimental-chat-10-20	1395.00	+/-20	806	Amazon	Proprietary
130	qwen3-235b-a22b-no-thinking	1394.00	+/-12	2,392	Alibaba	Apache 2.0
131	Qwen3-235B-A22B	1393.00	+/-14	1,604	Alibaba	Apache 2.0
132	M2.1	1392.00	+/-18	1,010	MiniMaxAI	MIT
133	GLM-4.5-Air	1390.00	+/-15	1,540	Zhipu AI	MIT
134	nvidia-llama-3.3-nemotron-super-49b-v1.5	1390.00	+/-39	194	Nvidia	Nvidia Open
135	Qwen3-Next (thinking)	1390.00	+/-20	828	Alibaba	Apache 2.0
136	Kimi K2	1389.00	+/-14	1,695	Moonshot AI	Modified MIT
137	OpenAI o3-mini (high)	1388.00	+/-18	977	OpenAI	Proprietary
138	Claude Sonnet 4	1388.00	+/-12	2,472	Anthropic	Proprietary
139	OpenAI o1	1386.00	+/-10	4,569	OpenAI	Proprietary
140	Claude Sonnet 3.7 (thinking-32k)	1384.00	+/-11	2,793	Anthropic	Proprietary
141	trinity-large-thinking	1384.00	+/-15	1,617	Arcee AI	Apache 2.0
142	intellect-3	1383.00	+/-31	334	Prime Intellect	MIT
143	GPT OSS 120B	1382.00	+/-14	1,792	OpenAI	Apache 2.0
144	OpenAI o3-mini	1382.00	+/-8	4,721	OpenAI	Proprietary
145	Qwen3-30B-A3B-2507	1381.00	+/-15	1,426	Alibaba	Apache 2.0
146	llama-3.1-nemotron-ultra-253b-v1	1380.00	+/-37	209	Nvidia	Nvidia Open Model
147	mimo-v2-flash (non-thinking)	1379.00	+/-11	2,844	Xiaomi	MIT
148	Qwen3-Coder-480B-A35B	1377.00	+/-15	1,626	Alibaba	Apache 2.0
149	nvidia-nemotron-3-super-120b-a12b	1375.00	+/-25	515	Nvidia	NVIDIA Open Model
150	Grok 3	1374.00	+/-11	2,677	xAI	Proprietary
151	GPT-4.1	1373.00	+/-10	3,226	OpenAI	Proprietary
152	mimo-v2-flash (thinking)	1373.00	+/-22	632	Xiaomi	MIT
153	minimax-m1	1372.00	+/-13	1,801	MiniMax	Apache 2.0
154	DeepSeek-V3-0324	1370.00	+/-10	3,190	DeepSeek-AI	MIT
155	grok-3-mini-beta	1369.00	+/-14	1,529	xAI	Proprietary
156	GLM-4.7-Flash	1366.00	+/-21	716	Zhipu AI	MIT
157	Gemini 2.5 Flash-Lite (thinking)	1365.00	+/-12	2,094	Google DeepMind	Proprietary
158	Gemini 2.5 Flash-Lite-Preview-09-2025 (no-thinking)	1364.00	+/-11	2,878	Google DeepMind	Proprietary
159	Qwen2.5-Max	1364.00	+/-10	3,305	Alibaba	Proprietary
160	QwQ-32B	1364.00	+/-14	1,720	Alibaba	Apache 2.0
161	Step3	1364.00	+/-31	351	StepFunAI	Apache 2.0
162	Claude Sonnet 3.7	1362.00	+/-10	3,358	Anthropic	Proprietary
163	OpenAI o1-mini	1362.00	+/-8	7,499	OpenAI	Proprietary
164	trinity-large-preview	1361.00	+/-14	1,891	Arcee AI	Apache 2.0
165	GLM-4.5V	1357.00	+/-34	277	Zhipu AI	MIT
166	Gemini 2.0 Flash Experimental	1356.00	+/-9	4,065	DeepMind	Proprietary
167	MiniMax M2	1356.00	+/-33	319	MiniMaxAI	Apache 2.0
168	GPT-4.1 mini	1355.00	+/-11	2,693	OpenAI	Proprietary
169	ling-flash-2.0	1354.00	+/-27	460	Ant Group	MIT
170	nvidia-nemotron-3-nano-30b-a3b-bf16	1353.00	+/-19	987	Nvidia	NVIDIA Open Model
171	Qwen3-30B-A3B	1353.00	+/-14	1,707	Alibaba	Apache 2.0
172	Claude 3.5 Sonnet	1351.00	+/-7	10,017	Anthropic	Proprietary
173	mistral-medium-2505	1349.00	+/-12	2,229	Mistral	Proprietary
174	hunyuan-turbos-20250416	1348.00	+/-20	845	Tencent	Proprietary
175	GPT-5-Nano (high)	1344.00	+/-27	493	OpenAI	Proprietary
176	Claude 3.5 Sonnet	1342.00	+/-7	11,359	Anthropic	Proprietary
177	ring-flash-2.0	1339.00	+/-27	453	Ant Group	MIT
178	Mistral-Small-3.2	1339.00	+/-18	1,042	MistralAI	Apache 2.0
179	Gemini 1.5 Pro	1339.00	+/-7	7,610	Google DeepMind	Proprietary
180	GPT OSS 20B	1336.00	+/-22	680	OpenAI	Apache 2.0
181	Nova 2 Lite	1335.00	+/-20	826	Amazon	Proprietary
182	Gemini 2.0 Flash-Lite	1326.00	+/-10	2,814	DeepMind	Proprietary
183	qwen-plus-0125	1324.00	+/-19	732	Alibaba	Proprietary
184	Gemma 3 - 27B (IT)	1322.00	+/-9	3,581	Google DeepMind	Gemma
185	granite-4.1-8b	1320.00	+/-39	236	IBM	Apache 2.0
186	llama-3.1-405b-instruct-fp8	1319.00	+/-8	8,482	Meta	Llama 3.1 Community
187	Llama 4 Maverick Instruct	1318.00	+/-11	2,838	Facebook AI Research	Llama 4
188	Gemma 3 - 12B (IT)	1317.00	+/-27	389	Google DeepMind	Gemma
189	llama-3.1-405b-instruct-bf16	1315.00	+/-8	5,215	Meta	Llama 3.1 Community
190	step-2-16k-exp-202412	1313.00	+/-20	642	StepFun	Proprietary
191	athene-v2-chat	1312.00	+/-9	3,412	NexusFlow	NexusFlow
192	Claude3-Opus	1312.00	+/-6	25,769	Anthropic	Proprietary
193	olmo-3-32b-think	1311.00	+/-32	314	Ai2	Apache 2.0
194	DeepSeek-V3	1311.00	+/-11	2,721	DeepSeek-AI	DeepSeek
195	C4AI Command A (202503)	1309.00	+/-9	3,994	CohereAI	CC-BY-NC-4.0
196	Llama 4 Scout Instruct	1309.00	+/-13	1,945	Facebook AI Research	Llama
197	GPT-4o	1309.00	+/-8	6,826	OpenAI	Proprietary
198	yi-lightning	1306.00	+/-10	3,921	01 AI	Proprietary
199	olmo-3.1-32b-instruct	1306.00	+/-23	696	Ai2	Apache 2.0
200	gemini-advanced-0514	1305.00	+/-10	6,395	Google	Proprietary
201	GPT-4o	1305.00	+/-7	15,103	OpenAI	Proprietary
202	qwen2.5-plus-1127	1304.00	+/-14	1,404	Alibaba	Proprietary
203	GPT-4	1303.00	+/-8	13,306	OpenAI	Proprietary
204	hunyuan-turbos-20250226	1302.00	+/-31	238	Tencent	Proprietary
205	GPT-4	1299.00	+/-8	12,374	OpenAI	Proprietary
206	step-1o-turbo-202506	1299.00	+/-24	565	StepFun	Proprietary
207	glm-4-plus-0111	1298.00	+/-19	721	Zhipu	Proprietary
208	Gemini 1.5 Pro	1298.00	+/-8	10,492	Google DeepMind	Proprietary
209	Qwen2.5-VL-72B-Instruct	1297.00	+/-8	5,415	Alibaba	Qwen
210	olmo-3.1-32b-think	1297.00	+/-26	473	Ai2	Apache 2.0
211	gpt-4-turbo-2024-04-09	1296.00	+/-8	13,217	OpenAI	Proprietary
212	Llama3.3-70B-Instruct	1296.00	+/-8	5,777	Facebook AI Research	Llama-3.3
213	Grok 2	1294.00	+/-7	8,950	xAI	Proprietary
214	hunyuan-large-2025-02-10	1294.00	+/-24	497	Tencent	Proprietary
215	deepseek-v2.5-1210	1293.00	+/-17	1,031	DeepSeek	DeepSeek
216	qwen-max-0919	1292.00	+/-12	2,249	Alibaba	Qwen
217	hunyuan-standard-2025-02-10	1290.00	+/-24	499	Tencent	Proprietary
218	gemini-1.5-flash-002	1288.00	+/-9	4,789	Google	Proprietary
219	mistral-large-2407	1288.00	+/-8	6,664	Mistral	Mistral Research
220	DeepSeek V2.5	1288.00	+/-10	3,649	DeepSeek-AI	DeepSeek
221	glm-4-plus	1287.00	+/-10	3,599	Zhipu AI	Proprietary
222	Claude 3.5 Haiku	1286.00	+/-7	6,365	Anthropic	Proprietary
223	Magistral-Medium-2506	1286.00	+/-26	554	MistralAI	Proprietary
224	GPT-4	1283.00	+/-10	7,052	OpenAI	Proprietary
225	mistral-large-2411	1282.00	+/-9	3,574	Mistral	MRL
226	hunyuan-large-vision	1280.00	+/-30	351	Tencent	Proprietary
227	hunyuan-turbo-0110	1279.00	+/-31	243	Tencent	Proprietary
228	ibm-granite-h-small	1279.00	+/-32	358	IBM	Apache 2.0
229	Llama3.1-70B-Instruct	1279.00	+/-17	1,041	Facebook AI Research	Llama 3.1
230	Mistral-Small-3.1-24B-Instruct-2503	1278.00	+/-13	2,129	MistralAI	Apache 2.0
231	GPT-4o mini	1276.00	+/-7	9,322	OpenAI	Proprietary
232	GPT-4	1275.00	+/-8	11,181	OpenAI	Proprietary
233	GPT-4.1 nano	1274.00	+/-23	582	OpenAI	Proprietary
234	Qwen2-72B-Instruct	1273.00	+/-9	4,835	Alibaba	Qianwen LICENSE
235	grok-2-mini-2024-08-13	1273.00	+/-8	7,261	xAI	Proprietary
236	deepseek-coder-v2	1271.00	+/-13	1,858	DeepSeek	DeepSeek License
237	llama-3.1-nemotron-51b-instruct	1271.00	+/-22	507	Nvidia	Llama 3.1
238	Qwen2.5-Coder-32B-Instruct	1270.00	+/-19	725	Alibaba	Apache 2.0
239	amazon-nova-pro-v1.0	1269.00	+/-10	2,978	Amazon	Proprietary
240	Llama3.1-70B-Instruct	1269.00	+/-8	7,677	Facebook AI Research	Llama 3.1 Community
241	Phi 4 - 14B	1265.00	+/-10	2,764	Microsoft Azure	MIT
242	llama-3.1-tulu-3-70b	1264.00	+/-25	397	Ai2	Llama 3.1
243	Mistral Small 24B Instruct 2501	1262.00	+/-13	1,683	MistralAI	Apache 2.0
244	athene-70b-0725	1261.00	+/-10	2,921	NexusFlow	CC-BY-NC-4.0
245	Gemma-3n-E4B	1260.00	+/-15	1,572	Google DeepMind	Gemma
246	Llama3-70B-Instruct	1257.00	+/-7	20,941	Facebook AI Research	Llama 3 Community
247	gemini-1.5-flash-001	1257.00	+/-8	8,392	Google	Proprietary
248	Gemma 3 - 4B (IT)	1254.00	+/-28	423	Google DeepMind	Gemma
249	Claude3-Sonnet	1253.00	+/-8	13,766	Anthropic	Proprietary
250	nemotron-4-340b-instruct	1252.00	+/-12	2,352	Nvidia	NVIDIA Open Model
251	hunyuan-standard-256k	1250.00	+/-29	361	Tencent	Proprietary
252	GLM4	1247.00	+/-16	1,191	Zhipu AI	Proprietary
253	reka-core-20240904	1246.00	+/-14	1,207	Reka AI	Proprietary
254	gemma-2-27b-it	1246.00	+/-7	10,170	Google	Gemma license
255	jamba-1.5-large	1245.00	+/-15	1,147	AI21 Labs	Jamba Open
256	amazon-nova-lite-v1.0	1244.00	+/-11	2,511	Amazon	Proprietary
257	mistral-large-2402	1244.00	+/-9	7,987	Mistral	Proprietary
258	C4AI Aya Vision 32B	1232.00	+/-10	3,854	CohereAI	CC-BY-NC-4.0
259	reka-flash-20240904	1232.00	+/-14	1,284	Reka AI	Proprietary
260	Claude3-Haiku	1231.00	+/-7	14,983	Anthropic	Proprietary
261	command-r-plus-08-2024	1231.00	+/-14	1,467	Cohere	CC-BY-NC-4.0
262	gemini-1.5-flash-8b-001	1229.00	+/-8	5,036	Google	Proprietary
263	Mixtral-8x22B-Instruct-v0.1	1228.00	+/-9	6,778	MistralAI	Apache 2.0
264	olmo-2-0325-32b-instruct	1227.00	+/-28	375	Ai2	Apache-2.0
265	amazon-nova-micro-v1.0	1224.00	+/-11	2,455	Amazon	Proprietary
266	Qwen1.5-110B-Chat	1221.00	+/-11	3,188	Alibaba	Qianwen LICENSE
267	mistral-medium	1220.00	+/-11	4,406	Mistral	Proprietary
268	gemma-2-9b-it	1218.00	+/-8	7,110	Google	Gemma license
269	Phi-3-medium 14B-preview	1215.00	+/-11	3,238	Microsoft Azure	MIT
270	ministral-8b-2410	1214.00	+/-20	683	Mistral	MRL
271	C4AI Command R+	1213.00	+/-8	9,769	CohereAI	CC-BY-NC-4.0
272	Yi-1.5-34B	1213.00	+/-11	2,985	01.AI	Apache-2.0
273	QwQ-32B-Preview	1212.00	+/-24	480	Alibaba	Apache 2.0
274	reka-flash-21b-20240226-online	1211.00	+/-14	2,028	Reka AI	Proprietary
275	Qwen1.5-72B-Chat	1208.00	+/-10	5,327	Alibaba	Qianwen LICENSE
276	InternLM2-Base-20B	1207.00	+/-15	1,387	Shanghai AI Laboratory	Other
277	llama-3.1-tulu-3-8b	1206.00	+/-26	363	Ai2	Llama 3.1
278	command-r-08-2024	1206.00	+/-14	1,601	Cohere	CC-BY-NC-4.0
279	gemma-2-9b-it-simpo	1205.00	+/-15	1,285	Princeton	MIT
280	gpt-3.5-turbo-1106	1203.00	+/-15	2,134	OpenAI	Proprietary
281	qwen1.5-32b-chat	1200.00	+/-12	2,649	Alibaba	Qianwen LICENSE
282	C4AI Aya Vision 8B	1200.00	+/-15	1,307	CohereAI	CC-BY-NC-4.0
283	gpt-3.5-turbo-0125	1200.00	+/-8	8,626	OpenAI	Proprietary
284	Gemini-pro	1199.00	+/-19	993	DeepMind	Proprietary
285	reka-flash-21b-20240226	1199.00	+/-11	3,363	Reka AI	Proprietary
286	granite-3.1-2b-instruct	1197.00	+/-26	391	IBM	Apache 2.0
287	granite-3.0-8b-instruct	1197.00	+/-19	873	IBM	Apache 2.0
288	zephyr-orpo-141b-A35b-v0.1	1196.00	+/-22	589	HuggingFace	Apache 2.0
289	gemini-pro-dev-api	1196.00	+/-14	2,274	Google	Proprietary
290	DBRX Instruct	1196.00	+/-11	4,001	databricks	DBRX LICENSE
291	Phi-3-mini 3.8B	1193.00	+/-14	1,568	Microsoft Azure	MIT
292	Phi-3-small 7B	1193.00	+/-13	2,092	Microsoft Azure	MIT
293	Llama3-8B-Instruct	1192.00	+/-8	14,252	Facebook AI Research	Llama 3 Community
294	mixtral-8x7b-instruct-v0.1	1191.00	+/-9	9,663	Mistral	Apache 2.0
295	Llama3.1-8B-Instruct	1190.00	+/-28	382	Facebook AI Research	Apache 2.0
296	Llama3.1-8B-Instruct	1189.00	+/-8	7,135	Facebook AI Research	Llama 3.1 Community
297	jamba-1.5-mini	1186.00	+/-16	1,094	AI21 Labs	Jamba Open
298	command-r	1176.00	+/-9	6,682	Cohere	CC-BY-NC-4.0
299	Qwen3-VL-2B	1168.00	+/-19	908	Alibaba	Apache 2.0
300	Qwen1.5-14B-Chat	1167.00	+/-14	2,184	Alibaba	Qianwen LICENSE
301	llama-3.2-3b-instruct	1165.00	+/-16	1,136	Meta	Llama 3.2
302	gemma-2-2b-it	1163.00	+/-8	6,599	Google	Gemma license
303	snowflake-arctic-instruct	1162.00	+/-11	4,793	Snowflake	Apache 2.0
304	Gemma 1.1-7B-IT	1160.00	+/-11	3,039	Google Research	Gemma license
305	openchat-3.5-0106	1158.00	+/-14	1,726	OpenChat	Apache-2.0
306	starling-lm-7b-beta	1158.00	+/-14	1,973	Nexusflow	Apache-2.0
307	WizardLM-70B-V1.0	1157.00	+/-19	903	WizardLM Team	Llama 2 Community
308	DeepSeek LLM 67B Chat	1155.00	+/-23	576	DeepSeek-AI	DeepSeek License
309	smollm2-1.7b-instruct	1152.00	+/-33	271	HuggingFace	Apache 2.0
310	openhermes-2.5-mistral-7b	1151.00	+/-20	697	NousResearch	Apache-2.0
311	Yi-34B	1151.00	+/-13	2,043	01.AI	Yi License
312	Phi-3-mini 3.8B	1150.00	+/-12	2,564	Microsoft Azure	MIT
313	tulu-2-dpo-70b	1145.00	+/-19	888	AllenAI/UW	AI2 ImpACT Low-risk
314	Phi-3-mini 3.8B	1139.00	+/-13	2,813	Microsoft Azure	MIT
315	llama-2-70b-chat	1136.00	+/-10	4,740	Meta	Llama 2 Community
316	Mistral-7B-Instruct-v0.2	1127.00	+/-12	2,605	MistralAI	Apache-2.0
317	starling-lm-7b-alpha	1126.00	+/-16	1,300	UC Berkeley	CC-BY-NC-4.0
318	Qwen-14B-Chat	1125.00	+/-24	534	Alibaba	Qianwen LICENSE
319	dolphin-2.2.1-mistral-7b	1125.00	+/-32	219	Cognitive Computations	Apache-2.0
320	openchat-3.5	1125.00	+/-18	945	OpenChat	Apache-2.0
321	llama-3.2-1b-instruct	1124.00	+/-16	1,162	Meta	Llama 3.2
322	Qwen1.5-7B-Chat	1120.00	+/-20	690	Alibaba	Qianwen LICENSE
323	Gemma 7B - It	1118.00	+/-16	1,120	Google Research	Gemma license
324	Vicuna 33B	1115.00	+/-13	2,663	LM-SYS	Non-commercial
325	PaLM 2	1115.00	+/-19	901	Google Research	Proprietary
326	llama2-70b-steerlm-chat	1114.00	+/-27	440	Nvidia	Llama 2 Community
327	Baichuan2-13B-Chat	1110.00	+/-13	2,218	Baichuan	Llama 2 Community
328	CodeLLaMA-34B	1109.00	+/-19	770	Facebook AI Research	Llama 2 Community
329	solar-10.7b-instruct-v1.0	1109.00	+/-22	604	Upstage AI	CC-BY-NC-4.0
330	Gemma 1.1-2B-IT	1108.00	+/-16	1,355	Google Research	Gemma license
331	MPT-30B-Chat	1095.00	+/-34	242	MosaicML	CC-BY-NC-SA-4.0
332	nous-hermes-2-mixtral-8x7b-dpo	1093.00	+/-21	628	NousResearch	Apache-2.0
333	Baichuan2-7B-Chat	1086.00	+/-14	1,656	Baichuan	Llama 2 Community
334	Qwen1.5-4B-Chat	1086.00	+/-18	988	Alibaba	Qianwen LICENSE
335	stripedhyena-nous-7b	1084.00	+/-20	676	Together AI	Apache 2.0
336	Vicuna 13B	1083.00	+/-14	2,146	LM-SYS	Llama 2 Community
337	zephyr-7b-beta	1082.00	+/-17	1,250	HuggingFace	MIT
338	Mistral 7B Instruct	1082.00	+/-19	974	MistralAI	Apache 2.0
339	guanaco-33b	1080.00	+/-32	280	UW	Non-commercial
340	Gemma 2B - It	1070.00	+/-22	597	Google Research	Gemma license
341	wizardlm-13b	1064.00	+/-21	669	Microsoft	Llama 2 Community
342	olmo-7b-instruct	1054.00	+/-19	848	Ai2	Apache-2.0
343	Vicuna 7B	1047.00	+/-22	658	LM-SYS	Llama 2 Community
344	ChatGLM3-6B	1042.00	+/-23	576	Zhipu AI	Apache-2.0
345	GPT4All 13B	998.00	+/-37	211	Nomic AI	Non-commercial
346	alpaca-13b	992.00	+/-23	652	Stanford	Non-commercial
347	MPT-7B-Chat	985.00	+/-25	471	MosaicML	CC-BY-NC-SA-4.0
348	RWKV-4-Raven-14B	983.00	+/-24	544	RWKV	Apache 2.0
349	Koala	980.00	+/-21	751	DAMO Academy	Non-commercial
350	ChatGLM-6B	976.00	+/-26	525	Zhipu AI	Non-commercial
351	ChatGLM2-6B	971.00	+/-35	227	Zhipu AI	Apache-2.0
352	oasst-pythia-12b	960.00	+/-22	687	OpenAssistant	Apache 2.0
353	dolly-v2-12b	950.00	+/-29	370	Databricks	MIT
354	fastchat-t5-3b	919.00	+/-26	462	LMSYS	Apache 2.0
355	LLaMA 13B	919.00	+/-33	252	Facebook AI Research	Non-commercial
356	stablelm-tuned-alpha-7b	890.00	+/-29	353	Stability AI	CC-BY-NC-SA-4.0

Data is for reference only. Official sources are authoritative.
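When reading the table, closely ranked models are often not clearly separated once the 95% CI column is taken into account. The sketch below checks whether two entries' intervals overlap, using three rows copied from the ranking table. It assumes the "+/-" value is a symmetric half-width around the score; the `Entry` type and `intervals_overlap` helper are hypothetical names, and interval overlap is only a rough heuristic, not a formal significance test.

```python
from typing import NamedTuple, Tuple


class Entry(NamedTuple):
    model: str
    score: float
    ci_half_width: float  # the "+/-" value, assumed symmetric

    @property
    def interval(self) -> Tuple[float, float]:
        """Lower and upper bound of the assumed 95% interval."""
        return (self.score - self.ci_half_width, self.score + self.ci_half_width)


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """Return True when the two confidence intervals share any point."""
    a_low, a_high = a.interval
    b_low, b_high = b.interval
    return a_low <= b_high and b_low <= a_high


# Rows taken from the ranking table (ranks 1, 4, and 172).
fable = Entry("Claude Fable 5", 1517.0, 37.0)
opus46 = Entry("Claude Opus 4.6", 1504.0, 12.0)
sonnet35 = Entry("Claude 3.5 Sonnet", 1351.0, 7.0)

print(intervals_overlap(fable, opus46))    # intervals [1480, 1554] and [1492, 1516]
print(intervals_overlap(fable, sonnet35))  # intervals [1480, 1554] and [1344, 1358]
```

Under these assumptions, the first pair overlaps and the second does not, so a 13-point gap near the top of the table is weaker evidence than the rank order alone suggests, especially for models with few votes.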

FAQ

What is LMArena Math Arena?

LMArena Math Arena is an anonymous evaluation track focused on mathematical reasoning. Users submit real math questions, compare hidden model solutions side by side, and vote for the better answer; the leaderboard is then calculated with Elo-style scoring.
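To make the Elo-style scores concrete, a score gap can be translated into an expected preference rate. The sketch below uses the standard Elo logistic formula with a 400-point scale; that scale and the helper name `expected_preference` are assumptions, since this page does not state the exact constants LMArena uses.

```python
def expected_preference(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected share of votes for model A over model B under the Elo formula."""
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


# Scores taken from the ranking table: rank 1 (1517) versus rank 4 (1504).
top_vs_fourth = expected_preference(1517.0, 1504.0)
print(f"{top_vs_fourth:.3f}")  # roughly 0.519

# Rank 1 (1517) versus the last entry (890).
top_vs_last = expected_preference(1517.0, 890.0)
print(f"{top_vs_last:.3f}")
```

Under this formula, a 13-point gap corresponds to only about a 52% expected preference, which is why small score differences should be read together with the confidence intervals.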

How is Math Arena different from MATH-500 or AIME?

Static benchmarks such as MATH-500 and AIME use fixed problem sets and automated grading. Math Arena uses open-ended user questions and human preference voting, making it a useful complement for measuring how models handle varied real-world math tasks.

Do thinking models perform better in Math Arena?

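Models with extended reasoning or chain-of-thought style capabilities often rank higher on math tasks because they spend more time decomposing and checking solutions. In this snapshot, for example, Claude Opus 4.6 (thinking) scores 1516 versus 1504 for Claude Opus 4.6. The advantage is not universal: Qwen3-Next (thinking) scores 1390 versus 1419 for Qwen3-Next. Any benefit can also come with higher latency and cost.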

How do China-developed models perform in math?

DeepSeek, Qwen, GLM, and related models have become competitive in math reasoning leaderboards. Open licenses and Chinese-language support can make them especially useful for local deployment and education scenarios.