A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.

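LMArena Math Arena Mathematical Reasoning Leaderboard

The latest ranking of AI large language models by mathematical reasoning ability, based on anonymous user votes in LMArena Math Arena. It covers each model's Elo score, 95% confidence interval, vote count, organization, and license.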

Top Model: -
Top Score: -
Model Count: 0
Data version: No data available

Data source: LM Arena

About This Leaderboard

This leaderboard ranks AI models by mathematical reasoning ability. Data comes from LMArena's Math sub-track, evaluated through anonymous blind testing by real users on math problem-solving tasks.

Methodology Overview

Blind testing: Users submit math problems, two anonymous models provide solutions, and users vote for the better answer, which eliminates brand bias.

Elo scoring: Uses the Bradley-Terry model to calculate Elo scores. Higher scores mean users more frequently prefer that model's math solutions.

Broad scenario coverage: Testing spans algebra, geometry, calculus, competition math, and other real-world math tasks.
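
The Elo scoring step described above can be sketched in a few lines. This is a minimal illustration, not LMArena's actual pipeline: it fits Bradley-Terry strengths to a list of (winner, loser) votes with a simple iterative update, then maps them onto an Elo-like scale. The function names, the base rating of 1000, and the scale of 400 are assumptions chosen for illustration, and the sketch assumes no ties and that every model has at least one win.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iters=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Hypothetical helper: uses the classic iterative (minorization-maximization)
    update, where each strength is wins / sum(games_ij / (p_i + p_j)).
    """
    models = sorted({m for pair in battles for m in pair})
    wins = defaultdict(int)
    games = defaultdict(int)  # number of battles per unordered pair
    for winner, loser in battles:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = sum(
                games.get(frozenset((i, j)), 0) / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            updated[i] = wins[i] / denom if denom > 0 else strength[i]
        # Rescale so the strengths average to 1 (the model is scale-invariant).
        total = sum(updated.values())
        strength = {m: v * len(models) / total for m, v in updated.items()}
    return strength


def to_elo(strength, base=1000.0, scale=400.0):
    """Map strengths to an Elo-like scale (base and scale are assumed values)."""
    return {m: base + scale * math.log10(s) for m, s in strength.items()}


# Toy votes: model "A" is preferred over model "B" in 3 of 4 battles.
votes = [("A", "B")] * 3 + [("B", "A")]
elo = to_elo(fit_bradley_terry(votes))
gap = elo["A"] - elo["B"]
```

With a 3-to-1 preference, the fitted strength ratio is 3, so the gap between the two ratings is 400 * log10(3), or roughly 191 points. Only rating differences are meaningful here; the absolute values depend on the assumed base.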

DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.

Ranking Table

Rank | Model | Score | 95% CI | Votes | Organization | License
No data available

Data is for reference only. Official sources are authoritative. Click model names to view DataLearner model profiles.

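Frequently Asked Questions (FAQ)

What is LMArena Math Arena?
LMArena Math Arena is LMArena's anonymous evaluation track focused on mathematical reasoning. Users submit real math problems (such as algebra, geometry, and competition math), the system shows different models' solutions side by side with the model names hidden, and users vote for the better solution. The votes are then aggregated with the Elo algorithm into a dynamic leaderboard.

How does Math Arena differ from static benchmarks such as MATH-500 and AIME?
Static benchmarks such as MATH-500, AIME, and AMC use fixed problem sets and automatic scoring. They are highly reproducible but vulnerable to targeted optimization ("leaderboard gaming"). Math Arena draws on open-ended math questions from real users, so its test content is not fixed and it better reflects how models naturally perform in practical math scenarios. The two approaches complement each other.

Do thinking models perform better in the Math Arena?
Overall, models with chain-of-thought or extended reasoning capabilities tend to rank higher in the Math Arena. The Thinking mode of the Claude Opus series, GPT high-compute modes, and DeepSeek thinking versions all rank near the top of the leaderboard, which suggests that longer reasoning time significantly improves the quality of answers to math problems.

How do Chinese LLMs perform in mathematical ability?
Chinese models such as DeepSeek, the Qwen3 series, and GLM perform strongly in Math Arena and rank among the global leaders. DeepSeek is open-sourced under the MIT license, and series such as Qwen3-235B support Chinese-language math scenarios, making them important reference points when choosing an open-source math reasoning model.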