GPT-4o mini Benchmark Details

GPT-4o mini's best placements relative to field size are MBPP (score 87.20, rank 4 / 28), MATH (score 70.20, rank 19 / 42), and GSM8K (score 91.30, rank 12 / 26).
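
The page does not say how it picks the three benchmarks named in the summary. Their order (MBPP, MATH, GSM8K) is consistent with sorting by rank divided by field size rather than by raw score, so the sketch below assumes that criterion. The rows are copied from the tables on this page; labels such as "General Knowledge row 1" are placeholders for rows whose benchmark name is not shown, and the function names are illustrative only.

```python
# Rows copied from the tables on this page: (label, score, rank, total).
# Labels ending in "row N" are placeholders: the page does not name
# those benchmarks.
ROWS = [
    ("General Knowledge row 1", 82.0, 46, 65),
    ("General Knowledge row 2", 61.70, 104, 126),
    ("General Knowledge row 3", 41.10, 164, 179),
    ("General Knowledge row 4", 40.20, 11, 14),
    ("GSM8K", 91.30, 12, 26),
    ("MATH", 70.20, 19, 42),
    ("Coding row 1", 87.20, 19, 39),
    ("MBPP", 87.20, 4, 28),
    ("Common Sense row 1", 9.50, 41, 45),
    ("Simple Bench", 10.70, 63, 63),
    ("Aider-Polyglot", 3.60, 59, 59),
    ("Pinch Bench", 75.0, 27, 37),
]


def relative_rank(row):
    """Rank as a fraction of the field size; smaller is better."""
    _, _, rank, total = row
    return rank / total


def top_placements(rows, n=3):
    """Return the n rows with the best (smallest) relative rank."""
    # ASSUMPTION: the summary orders benchmarks by rank / total.
    return sorted(rows, key=relative_rank)[:n]


if __name__ == "__main__":
    for label, score, rank, total in top_placements(ROWS):
        print(f"{label}: score {score:.2f}, rank {rank} / {total}")
```

Sorting by raw score instead would put GSM8K (91.30) first, which is why relative rank looks like the more plausible reading of "led by".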

Benchmark Results

General Knowledge

4 evaluations
Benchmark / mode | Score | Rank / total
(unlabeled) | 82 | 46 / 65
(unlabeled) | 61.70 | 104 / 126
(unlabeled) | 41.10 | 164 / 179
(unlabeled) | 40.20 | 11 / 14

Math and Reasoning

2 evaluations
Benchmark / mode | Score | Rank / total
GSM8K | 91.30 | 12 / 26
MATH | 70.20 | 19 / 42

Coding and Software Engineering

2 evaluations
Benchmark / mode | Score | Rank / total
(unlabeled) | 87.20 | 19 / 39
MBPP | 87.20 | 4 / 28

Common Sense

1 evaluation
Benchmark / mode | Score | Rank / total
(unlabeled) | 9.50 | 41 / 45

Commonsense Reasoning

1 evaluation
Benchmark / mode | Score | Rank / total
Simple Bench (Standard Mode) | 10.70 | 63 / 63

Agent Level Benchmark

1 evaluation
Benchmark / mode | Score | Rank / total
Aider-Polyglot (Standard Mode) | 3.60 | 59 / 59

Claw-style Agent Evaluation

1 evaluation
Benchmark / mode | Score | Rank / total
Pinch Bench (Thinking Enabled, Tools) | 75 | 27 / 37