Gemini 2.5 Flash Benchmark Details

Gemini 2.5 Flash's benchmark results are led by AIME 2024 (score 88, rank 16 of 62), GPQA Diamond (score 82.80, rank 63 of 179), and FrontierMath - Tier 4 (score 4.20, rank 40 of 80).
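Because each rank is reported against a different number of ranked models, it can help to normalize a "rank / total" entry before comparing across benchmarks. The sketch below is one way to do that, assuming rank 1 is the best position; the helper name `rank_percentile` is hypothetical and not part of the source page.

```python
def rank_percentile(rank_total: str) -> float:
    """Convert a 'rank / total' string into a top-X% figure.

    Assumes rank 1 is best, so a lower result means a stronger placement.
    """
    rank_text, total_text = rank_total.split("/")
    rank, total = int(rank_text), int(total_text)
    if not 1 <= rank <= total:
        raise ValueError(f"rank must lie between 1 and total, got {rank_total!r}")
    return 100.0 * rank / total


# Rank/total values taken from the summary above.
for name, entry in [
    ("AIME 2024", "16 / 62"),
    ("GPQA Diamond", "63 / 179"),
    ("FrontierMath - Tier 4", "40 / 80"),
]:
    print(f"{name}: top {rank_percentile(entry):.1f}% of ranked models")
```

Under this reading, the AIME 2024 placement falls in roughly the top quarter of ranked models, while the FrontierMath - Tier 4 placement sits at the midpoint.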

Benchmark Results

General Knowledge

6 evaluations
Benchmark / mode    Score    Rank/total
GPQA Diamond        82.80    63 / 179
-                   78.30    83 / 179
-                   64.35    35 / 52
-                   32.30    51 / 65
-                   11       129 / 159
-                   8.40     140 / 159

Common Sense

2 evaluations
Benchmark / mode    Score    Rank/total
-                   26.90    27 / 45
-                   25.80    28 / 45

Coding and Software Engineering

4 evaluations
Benchmark / mode    Score    Rank/total
-                   55.40    81 / 120
-                   41.10    98 / 120
-                   48.90    94 / 108

Math and Reasoning

5 evaluations
Benchmark / mode         Score    Rank/total
AIME 2024                88       16 / 62
-                        72       70 / 106
-                        61.60    81 / 106
-                        7.80     6 / 10
FrontierMath - Tier 4    4.20     40 / 80

Agent Level Benchmark

1 evaluation
Benchmark / mode    Score    Rank/total
-                   56.70    20 / 26

Claw-style Agent Evaluation

1 evaluation
Benchmark / mode                          Score    Rank/total
Pinch Bench (Thinking Enabled, Tools)     70.70    31 / 37