DataLearnerAI
Gemini 3.0 Pro (Preview 11-2025) Benchmark Analysis
Google DeepMind
Updated 2/22/2026
In-depth Analysis
The most powerful model in the Gemini 3.0 series released by Google.
Benchmark Results
General evaluation
13 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| GPQA Diamond | Parallel thinking | 93.80 | 2 / 153 |
| GPQA Diamond | Thinking | 91.90 | 5 / 153 |
| GPQA Diamond | Thinking·High | 91 | 7 / 153 |
| MMLU Pro | Thinking | 90 | 2 / 112 |
| ARC-AGI | Parallel thinking | 87.50 | 5 / 42 |
| ARC-AGI | Thinking | 75 | 9 / 42 |
| LiveBench | Thinking | 74.14 | 9 / 52 |
| HLE | Thinking·High + With tools | 45.80 | 11 / 105 |
| ARC-AGI-2 | Parallel thinking | 45.10 | 10 / 34 |
| HLE | Parallel thinking | 41 | 20 / 105 |
| HLE | Thinking | 37.50 | 24 / 105 |
| HLE | Thinking·High | 37.20 | 25 / 105 |
| ARC-AGI-2 | Thinking | 31.10 | 13 / 34 |
General knowledge QA
1 evaluation

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| SimpleQA | Thinking | 72.10 | 5 / 44 |
Coding and software engineering
2 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| LiveCodeBench | Thinking | 92 | 1 / 103 |
| SWE-bench Verified | Thinking | 76.20 | 17 / 87 |
Mathematical reasoning
4 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| AIME 2025 | Thinking | 95 | 24 / 105 |
| AIME 2026 | Thinking | 90.60 | 7 / 7 |
| FrontierMath | Thinking | 38 | 2 / 52 |
| FrontierMath - Tier 4 | Thinking | 18.80 | 2 / 32 |
Commonsense reasoning
1 evaluation

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| Simple Bench | Thinking | 76.40 | 1 / 27 |
Agent capability evaluation
4 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| τ²-Bench - Telecom | Thinking·High + With tools | 98 | 4 / 29 |
| τ²-Bench | Thinking + With tools | 85.40 | 6 / 34 |
| Terminal Bench Hard | Thinking·High + With tools | 42 | 6 / 14 |
| Terminal Bench Hard | Thinking + With tools | 39 | 7 / 14 |
Instruction following
2 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| IF Bench | Thinking·High + With tools | 70 | 6 / 25 |
| IF Bench | Thinking | 70 | 6 / 25 |
AI Agent - Information gathering
1 evaluation

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| BrowseComp | Thinking·High + With tools | 59.20 | 15 / 27 |
AI Agent - Tool use
2 evaluations

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| Terminal Bench 2.0 | Thinking·High + With tools | 56.90 | 7 / 20 |
| Terminal Bench 2.0 | Thinking + With tools | 54.20 | 8 / 20 |
Productivity knowledge
1 evaluation

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| GDPval-AA | Thinking·High | 35 | 8 / 11 |
Long-context capability
1 evaluation

| Benchmark | Mode | Score | Rank/total |
| --- | --- | --- | --- |
| AA-LCR | Thinking·High | 71 | 1 / 12 |
References
artificialanalysis.ai