Gemini 2.5 Pro Experimental 03-25 Benchmark Details

Gemini 2.5 Pro Experimental 03-25 currently shows benchmark results led by AIME 2024 (rank 9 / 62, score 92), Aider-Polyglot (rank 12 / 59, score 72.90), and SimpleQA (rank 12 / 45, score 52.90).

Benchmark Results

General Knowledge

2 evaluations

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| GPQA Diamond | | 55 / 179 |
| HLE | 18.80 | 110 / 159 |

Common Sense

1 evaluation

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| SimpleQA | 52.90 | 12 / 45 |

Coding and Software Engineering

2 evaluations

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| LiveCodeBench | 70.40 | 53 / 120 |
| SWE-bench Verified | 63.80 | 72 / 108 |

Math and Reasoning

3 evaluations

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| AIME 2024 | 92 | 9 / 62 |
| AIME 2025 | 86.90 | 46 / 106 |
| FrontierMath - Tier 4 (Standard Mode) | 4.20 | 40 / 80 |

Commonsense Reasoning

1 evaluation

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| Simple Bench (Standard Mode) | 51.60 | 27 / 63 |

Agent Level Benchmark

1 evaluation

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| Aider-Polyglot (Standard Mode) | 72.90 | 12 / 59 |

Claw-style Agent Evaluation

2 evaluations

| Benchmark / mode | Score | Rank/total |
| --- | --- | --- |
| Claw Bench (Thinking Enabled, Tools) | 80.40 | 20 / 29 |
| Pinch Bench (Thinking Enabled, Tools) | 71.90 | 29 / 37 |
