Grok 4 Benchmark Details

Grok 4's leading benchmark results are IMO 2024 (rank 1 of 10, score 23.20), IMO 2025 (rank 1 of 9, score 29.20), and MMLU Pro (rank 14 of 126, score 87).

Benchmark Results

Thinking

General Knowledge

8 evaluations
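Benchmark / mode | Score | Rank/total
MMLU Pro | 87 | 14 / 126
(unnamed) | 87 | 39 / 179
(unnamed) | 66.70 | 29 / 65
LiveBench, Standard Mode | 62.02 | 59 / 115
(unnamed) | 38.60 | 55 / 159
(unnamed) | 38.60 | 55 / 159
(unnamed) | 25.40 | 88 / 159
(unnamed) | 15.90 | 34 / 59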

Coding and Software Engineering

2 evaluations
Benchmark / mode | Score | Rank/total
(unnamed) | 82 | 25 / 120
(unnamed) | 58.60 | 79 / 108

Math and Reasoning

9 evaluations
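Benchmark / mode | Score | Rank/total
(unnamed) | 98.80 | 13 / 106
(unnamed) | 91.70 | 36 / 106
(unnamed) | 46.70 | 4 / 16
(unnamed) | 23.30 | 10 / 16
IMO 2025 | 29.20 | 1 / 9
IMO 2024 | 23.20 | 1 / 10
(unnamed) | 12.10 | 22 / 60
(unnamed) | 2.10 | 56 / 80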

AI Agent - Tool Usage

1 evaluation
Benchmark / mode | Score | Rank/total

Commonsense Reasoning

1 evaluation
Benchmark / mode | Score | Rank/total
Simple Bench, Thinking Enabled | 60.50 | 15 / 63

Agent Level Benchmark

2 evaluations
Benchmark / mode | Score | Rank/total
Aider-Polyglot, Thinking Level · High | 79.60 | 7 / 59