MATH-500

Updated May 2, 2026 · 2,659 views
Current SOTA: Gemini-2.5-Pro-Preview-05-06 (Google DeepMind), score 98.80
Problem Count: 500
Institution: OpenAI
Category: Math and Reasoning
Metrics: Accuracy
Language: English
Difficulty: Mixed

Overview

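MATH-500 is a 500-problem, English-language benchmark from OpenAI that evaluates mathematical reasoning. Models are scored by accuracy on problems of mixed difficulty. This page covers the benchmark's overview and metrics and lists model leaderboard results on DataLearnerAI.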
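
Because the benchmark has 500 problems and the metric is accuracy, a score can be read as an approximate count of solved problems. The sketch below assumes each score is the percentage of the 500 problems answered correctly; the helper name `implied_correct` is hypothetical and not part of any official tooling. A score that maps to a fractional count (98.10, for example) was presumably averaged over several runs or samples, which is an inference rather than something this page states.

```python
PROBLEM_COUNT = 500  # taken from the benchmark summary on this page


def implied_correct(score_percent: float, n_problems: int = PROBLEM_COUNT) -> float:
    """Convert an accuracy percentage into the implied number of solved problems.

    Assumes the score is a plain percentage of n_problems (an assumption,
    not a documented scoring rule).
    """
    return round(score_percent / 100 * n_problems, 2)


# Scores copied from the leaderboard on this page.
examples = [
    ("Gemini-2.5-Pro-Preview-05-06", 98.80),
    ("OpenAI o3", 98.10),
    ("GPT-4.1", 92.80),
]

for model, score in examples:
    print(f"{model}: {score:.2f} -> about {implied_correct(score)} of {PROBLEM_COUNT} problems")
```

Under this assumption one problem is worth 0.2 points, so the 0.6-point gap between 98.80 and 98.20 corresponds to three problems.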

Latest MATH-500 model rankings and full benchmark leaderboard

Browse the latest scores, model modes, release dates, and parameter sizes for MATH-500.

Source: DataLearnerAI

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

MATH-500 Rank

| Rank | Organization | Model | Mode | Score | Release date | Parameters | License |
|---|---|---|---|---|---|---|---|
| 1 | Google DeepMind | Gemini-2.5-Pro-Preview-05-06 | Standard Mode | 98.80 | 2025-05-06 | Unknown | Closed |
| 2 | Google DeepMind | Gemini 2.5-Pro | Standard Mode | 98.80 | 2025-06-05 | Unknown | Closed |
| 3 | Anthropic | Claude Opus 4 | Standard Mode | 98.20 | 2025-05-23 | Unknown | Closed |
| 4 | Zhipu AI | GLM-4.5 | Thinking Enabled | 98.20 | 2025-07-28 | 355B | Free Commercial |
| 5 | OpenAI | OpenAI o3 | Standard Mode | 98.10 | 2025-04-16 | Unknown | Closed |
| 6 | Zhipu AI | GLM-4.5-Air | Thinking Enabled | 98.10 | 2025-07-28 | 106B | Free Commercial |
| 7 | Alibaba | Qwen3-235B-A22B | Thinking Enabled | 98.00 | 2025-04-28 | 235B | Free Commercial |
| 8 | DeepSeek-AI | DeepSeek-R1-0528 | Thinking Enabled | 98.00 | 2025-05-28 | 671B | Free Commercial |
| 9 | OpenAI | OpenAI o3-mini (high) | Standard Mode | 97.90 | 2025-01-31 | Unknown | Closed |
| 10 | Anthropic | Claude Opus 4.6 | Extended Thinking | 97.60 | 2026-02-05 | Unknown | Closed |
| 11 | Alibaba | Qwen3-8B | Thinking Enabled | 97.40 | 2025-04-28 | 8B | Free Commercial |
| 12 | Moonshot AI | Kimi K2 | Standard Mode | 97.40 | 2025-07-11 | 1000B | Free Commercial |
| 13 | DeepSeek-AI | DeepSeek-R1 | Standard Mode | 97.30 | 2025-01-20 | 671B | Free Commercial |
| 14 | Alibaba | Qwen3-32B | Thinking Enabled | 97.20 | 2025-04-28 | 32B | Free Commercial |
| 15 | MiniMaxAI | MiniMax-M1-80k | Standard Mode | 96.80 | 2025-06-16 | 456B | Free Commercial |
| 16 | Huawei | Pangu Pro MoE | Standard Mode | 96.80 | 2025-06-30 | 71.9B | Free Commercial |
| 17 | OpenAI | OpenAI o1 | Standard Mode | 96.40 | 2024-12-05 | Unknown | Closed |
| 18 | Baidu | ERNIE-4.5-300B-A47B | Standard Mode | 96.40 | 2025-06-30 | 300B | Free Commercial |
| 19 | Moonshot AI | Kimi k1.5 (Long-CoT) | Standard Mode | 96.20 | 2025-01-22 | Unknown | Closed |
| 20 | Anthropic | Claude Sonnet 3.7-64K Extended Thinking | Standard Mode | 96.20 | 2025-02-25 | Unknown | Closed |
| 21 | Tencent AI Lab | Hunyuan-T1 | Standard Mode | 96.20 | 2025-03-21 | Unknown | Closed |
| 22 | Alibaba | Qwen3-235B-A22B | Standard Mode | 96.20 | 2025-04-28 | 235B | Free Commercial |
| 23 | MiniMaxAI | MiniMax-M1-40k | Standard Mode | 96.00 | 2025-06-16 | 456B | Free Commercial |
| 24 | OpenAI | OpenAI o3-mini | Thinking Enabled | 95.80 | 2025-01-31 | Unknown | Closed |
| 25 | Facebook AI Research | Llama 4 Behemoth Instruct | Standard Mode | 95.00 | 2025-04-05 | 2000B | Free Commercial |
| 26 | Moonshot AI | Kimi k1.5 (Short-CoT) | Standard Mode | 94.60 | 2025-01-22 | Unknown | Closed |
| 27 | DeepSeek-AI | DeepSeek-R1-Distill-Llama-70B | Standard Mode | 94.50 | 2025-01-20 | 70B | Free Commercial |
| 28 | DeepSeek-AI | DeepSeek-V3-0324 | Standard Mode | 94.00 | 2025-03-24 | 671B | Free Commercial |
| 29 | Tencent ARC | Hunyuan-7B | Standard Mode | 93.70 | 2025-08-04 | 7B | Free Commercial |
| 30 | OpenAI | GPT-4.1 | Standard Mode | 92.80 | 2025-04-14 | Unknown | Closed |