Gemini 3.0 Flash vs Gemini 2.5 Flash
Across 7 shared benchmarks, Gemini 3.0 Flash leads overall: it wins 6, Gemini 2.5 Flash wins 0, and 1 is a tie, with an average score difference of +23.06.
Gemini 3.0 Flash
Google DeepMind · 2025-12-17 · AI model
Gemini 2.5 Flash
Google DeepMind · 2025-04-17 · Reasoning model
Gemini 3.0 Flash: 6 wins (86%) · Ties: 1 · Gemini 2.5 Flash: 0 wins (0%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 7 shared benchmarks.
General Knowledge
Gemini 3.0 Flash 2/2

| Benchmark | Gemini 3.0 Flash | Gemini 2.5 Flash | Diff |
|---|---|---|---|
| HLE | 43.50 (33 / 150; thinking + tool use) | 8.40 (131 / 150) | +35.10 |
| GPQA Diamond | 90.40 (15 / 175; thinking) | 78.30 (79 / 175) | +12.10 |
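
The Diff column and the "largest gap first" ordering can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the page's actual generator: it assumes Diff is the Gemini 3.0 Flash score minus the Gemini 2.5 Flash score, and the helper names `with_diff` and `sort_by_gap` are hypothetical. It covers only the two General Knowledge rows shown above.

```python
# Rows copied from the General Knowledge table:
# (benchmark, Gemini 3.0 Flash score, Gemini 2.5 Flash score)
rows = [
    ("GPQA Diamond", 90.40, 78.30),
    ("HLE", 43.50, 8.40),
]


def with_diff(rows):
    """Attach the score difference (first model minus second) to each row."""
    return [(name, a, b, round(a - b, 2)) for name, a, b in rows]


def sort_by_gap(rows_with_diff):
    """Order rows by absolute gap, largest first (assumed sort rule)."""
    return sorted(rows_with_diff, key=lambda r: abs(r[3]), reverse=True)


ranked = sort_by_gap(with_diff(rows))
for name, a, b, diff in ranked:
    print(f"{name}: {a:.2f} vs {b:.2f} ({diff:+.2f})")
```

With these two rows, HLE (+35.10) sorts ahead of GPQA Diamond (+12.10), matching the order in the table.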
Math and Reasoning
Gemini 3.0 Flash 1/2

| Benchmark | Gemini 3.0 Flash | Gemini 2.5 Flash |
|---|---|---|
Specs
| Field | Gemini 3.0 Flash | Gemini 2.5 Flash |
|---|---|---|
| Publisher | Google DeepMind | Google DeepMind |
| Release date | 2025-12-17 | 2025-04-17 |
| Model type | AI model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length (tokens) | 2000K | 1000K |
| Max output (tokens) | 65536 | 65536 |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | Gemini 3.0 Flash | Gemini 2.5 Flash |
|---|---|---|
| Text input | $0.50 / 1M tokens | $0.15 / 1M tokens |
| Text output | $3.00 / 1M tokens | $0.60 / 1M tokens |
| Cache read | $0.05 / 1M tokens | Not public |
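
To see what the per-million-token prices mean for a single request, the sketch below converts them into a cost estimate. It is illustrative only: the `estimate_cost` helper and the 10,000-input / 2,000-output workload are assumptions, prices are taken as US dollars per 1M tokens from the table above, and cache reads are ignored because the Gemini 2.5 Flash figure is not public.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "Gemini 3.0 Flash": {"input": 0.50, "output": 3.00},
    "Gemini 2.5 Flash": {"input": 0.15, "output": 0.60},
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of one request (no caching assumed)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens.
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 10_000, 2_000):.4f}")
```

For this assumed workload the estimate is about $0.0110 for Gemini 3.0 Flash and $0.0027 for Gemini 2.5 Flash, so the newer model costs roughly four times as much per request at list price.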
Summary
- Gemini 3.0 Flash leads in: General Knowledge (2/2), Math and Reasoning (1/2), Claw-style Agent Evaluation (1/1), Coding and Software Engineer (1/1), Common Sense (1/1)
On average across the 7 shared benchmarks, Gemini 3.0 Flash scores 23.06 higher.
Largest single-benchmark gap: SimpleQA — Gemini 3.0 Flash 68.70 vs Gemini 2.5 Flash 25.80 (+42.90).
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.