Gemini 2.0 Flash Experimental Benchmark Details

Gemini 2.0 Flash Experimental's current benchmark results are led, by relative rank, by SimpleQA (rank 23 of 45, score 29.90), MMLU Pro (rank 75 of 126, score 76.24), and MMLU (rank 44 of 65, score 83.40).
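Because each benchmark ranks a different number of models, the raw ranks above are not directly comparable. One simple way to put them on a common scale is to divide rank by the number of ranked models. The following is a minimal sketch under that assumption; the `rank_fraction` helper and the `results` dictionary are hypothetical names introduced for illustration and are not part of the leaderboard itself.

```python
# Rank and pool size for the three benchmarks named in the summary.
# (rank, total) pairs are copied from the leaderboard figures.
results = {
    "SimpleQA": (23, 45),
    "MMLU Pro": (75, 126),
    "MMLU": (44, 65),
}


def rank_fraction(rank: int, total: int) -> float:
    """Return rank / total; a lower value means closer to the top."""
    if total <= 0 or not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


# Sort benchmark names from best to worst relative placement.
ordered = sorted(results, key=lambda name: rank_fraction(*results[name]))

for name in ordered:
    rank, total = results[name]
    print(f"{name}: {rank}/{total} -> {rank_fraction(rank, total):.2f}")
```

This normalization ignores score gaps between adjacent ranks, so it only indicates relative placement, not how far the model trails the leaders.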

Benchmark Results

General Knowledge

4 evaluations
Benchmark / mode | Score | Rank/total
MMLU | 83.40 | 44 / 65
MMLU Pro | 76.24 | 75 / 126
(name missing) | 65.20 | 130 / 179
(name missing) | 5.10 | 154 / 159

Common Sense

1 evaluation
Benchmark / mode | Score | Rank/total
SimpleQA | 29.90 | 23 / 45

Coding and Software Engineering

2 evaluations
Benchmark / mode | Score | Rank/total
(name missing) | 29.10 | 117 / 120
(name missing) | 21.40 | 108 / 108

Math and Reasoning

1 evaluation
Benchmark / mode | Score | Rank/total
(name missing) | 29.70 | 100 / 106

Common-Sense Reasoning

1 evaluation
Benchmark / mode | Score | Rank/total
Simple Bench (Standard Mode) | 18.90 | 59 / 63

Agent Level Benchmark

2 evaluations
Benchmark / mode | Score | Rank/total
Aider-Polyglot (Standard Mode) | 38.20 | 40 / 59
Aider-Polyglot (Thinking Enabled) | 18.20 | 50 / 59