Grok-3 - Reasoning Beta Benchmark Details

Grok-3 - Reasoning Beta's current benchmark results are led by AIME 2024 (score 93.30, rank 6 of 62), LiveCodeBench (score 79.40, rank 33 of 120), and GPQA Diamond (score 84.60, rank 52 of 179).

Benchmark Results

Thinking

General Knowledge

1 evaluation

Benchmark / mode    Score    Rank/total
GPQA Diamond        84.60    52 / 179

Math and Reasoning

1 evaluation

Benchmark / mode    Score    Rank/total
AIME 2024           93.30    6 / 62

Coding and Software Engineering

1 evaluation

Benchmark / mode    Score    Rank/total
LiveCodeBench       79.40    33 / 120