Benchmark Results
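General evaluation: 4 evaluations
Coding and software engineering: 2 evaluations
Mathematical reasoning: 3 evaluations
AI Agent - Tool use: 2 evaluations
OpenClaw comprehensive agent capability evaluation: 2 evaluations
Competitor Comparison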
Benchmark scores for Gemini 3.0 Flash compared against top models in its class
9 benchmarks have comparable scores. Each entry shows the model's best score together with the mode in which it was achieved.
| Benchmark | Category | Gemini 3.0 Flash (current) | Claude Sonnet 4 |
|---|---|---|---|
| ARC-AGI-2 | General evaluation | 33.60 (Thinking Enabled) | 5.90 (Thinking Enabled) |
| GPQA Diamond | General evaluation | 90.40 (Thinking Enabled) | 83.80 (Deep Thinking Mode, Tools) |
| HLE | General evaluation | 43.50 (Thinking Enabled, Tools) | 9.60 (Thinking Enabled) |
| SWE-Bench Pro - Public | Coding and software engineering | 49.60 (Thinking Level · High, Tools) | 42.70 (Thinking Enabled) |
| SWE-bench Verified | Coding and software engineering | 68.70 (Thinking Enabled) | 80.20 (Thinking Enabled, Tools) |
| AIME2025 | Mathematical reasoning | 99.70 (Thinking Enabled, Tools) | 85.00 (Deep Thinking Mode, Tools) |
| τ²-Bench | Agent capability evaluation | 90.20 (Thinking Enabled, Tools) | 52.00 (Standard Mode, Tools) |
| Claw Bench | OpenClaw comprehensive agent capability evaluation | 85.70 (Thinking Enabled, Tools) | 77.80 (Thinking Enabled, Tools) |
| Pinch Bench | OpenClaw comprehensive agent capability evaluation | 85.20 (Thinking Enabled, Tools) | 80.50 (Thinking Enabled, Tools) |
Standard API Pricing: Gemini 3.0 Flash vs. Peer Models
Standard text input and output prices are shown side by side for each model. Where extended-context pricing exists, only the base rate is listed.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.
| Model | Supplier | Standard input (USD per 1M tokens) | Standard output (USD per 1M tokens) | Base price applies to |
|---|---|---|---|---|
| Gemini 3.0 Flash | -- | 0.5 | 3 | -- |
| Claude Sonnet 4 | -- | 3 | 15 | -- |
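To make the per-million-token rates easier to compare, here is a minimal sketch that turns them into a cost for a single workload. The prices are copied from the table above. The function name `request_cost` and the example workload of 200,000 input tokens and 50,000 output tokens are assumptions chosen for illustration, not figures from the source. The sketch also ignores any extended-context or tool-use surcharges.

```python
# Standard prices in USD per 1M tokens, copied from the pricing table.
PRICES = {
    "Gemini 3.0 Flash": {"input": 0.5, "output": 3.0},
    "Claude Sonnet 4": {"input": 3.0, "output": 15.0},
}

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per one million tokens


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD for the given token counts at the standard rate."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Hypothetical workload: 200k input tokens, 50k output tokens.
    for model in PRICES:
        cost = request_cost(model, 200_000, 50_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the sketch gives $0.25 for Gemini 3.0 Flash and $1.35 for Claude Sonnet 4. The ratio depends on the input/output mix, since the input prices differ by a factor of 6 and the output prices by a factor of 5.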
Version History
How each version of the Gemini 3.0 Flash series stacks up on benchmark tests
7 benchmarks have comparable scores. Each entry shows the model's best score together with the mode in which it was achieved.
| Benchmark | Category | Gemini 3.0 Flash (current) | Gemini 2.5 Flash | Gemini 2.0 Flash Experimental |
|---|---|---|---|---|
| GPQA Diamond | General evaluation | 90.40 (Thinking Enabled) | 82.80 (Thinking Enabled) | 65.20 (Standard Mode) |
| HLE | General evaluation | 43.50 (Thinking Enabled, Tools) | 11.00 (Thinking Enabled) | 5.10 (Standard Mode) |
| SimpleQA | General knowledge QA | 68.70 (Thinking Enabled) | 26.90 (Thinking Enabled) | 29.90 (Standard Mode) |
| SWE-bench Verified | Coding and software engineering | 68.70 (Thinking Enabled) | 50.00 (Standard Mode) | 21.40 (Standard Mode) |
| AIME2025 | Mathematical reasoning | 99.70 (Thinking Enabled, Tools) | 72.00 (Thinking Enabled) | 29.70 (Standard Mode) |
| | | 4.20 (Standard Mode) | 4.20 (Standard Mode) | -- |
| Pinch Bench | OpenClaw comprehensive agent capability evaluation | 85.20 (Thinking Enabled, Tools) | 70.70 (Thinking Enabled, Tools) | -- |
Single-Benchmark Version Trend
[Trend chart: GPQA Diamond, General evaluation]
Standard API Pricing Across the Gemini 3.0 Flash Series
| Model | Supplier | Standard input (USD per 1M tokens) | Standard output (USD per 1M tokens) | Base price applies to |
|---|---|---|---|---|
| Gemini 3.0 Flash | -- | 0.5 | 3 | -- |
| Gemini 2.5 Flash | -- | 0.15 | 0.6 | -- |
| Gemini 2.0 Flash Experimental | -- | 0.10 | 0.40 | -- |