Gemini 3.1 Pro Preview vs Gemini 2.5-Pro: Benchmark Comparison

Key specs and per-benchmark scores for each model and mode. This page compares benchmark data and core specifications for 2 models.

Gemini 3.1 Pro Preview

Google DeepMind

Release: 2026-02-20
Context length: 1M
Parameters: Not provided
Max output: 32,768 tokens

Capability profile

Each axis is a category average, normalized to a 100-point radar.

View: Non-parallel mode average · 4 dimensions

Gemini 3.1 Pro Preview

Relative edge: AI Agent - Information gathering +78.1 / Relative gap: Multimodal understanding -1.5

Gemini 2.5-Pro

Relative edge: Multimodal understanding +1.5 / Relative gap: AI Agent - Information gathering -78.1

Method: for each model and benchmark, the chart first averages all scores in the current mode scope instead of taking the best score, then averages those benchmark scores within each category. Only benchmarks with at least two selected models scored are included; missing values are not counted as zero.
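The averaging method described above can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the function name `category_averages`, the tuple layout of `records`, and the `min_models` parameter are assumptions made for the example, and the final scaling to the 100-point radar is left out because the page does not specify it.

```python
from collections import defaultdict
from statistics import mean


def category_averages(records, min_models=2):
    """Average scores per (model, category).

    records: iterable of (model, benchmark, category, score) tuples,
    one per evaluated mode (hypothetical layout). A score of None
    marks a missing value.
    """
    # Step 1: average every mode-level score per (model, benchmark)
    # instead of taking the best one.
    runs = defaultdict(list)
    category_of = {}
    for model, benchmark, category, score in records:
        if score is None:  # missing values are skipped, never counted as zero
            continue
        runs[(model, benchmark)].append(score)
        category_of[benchmark] = category
    per_benchmark = {key: mean(scores) for key, scores in runs.items()}

    # Step 2: keep only benchmarks scored by at least `min_models` models.
    models_per_benchmark = defaultdict(set)
    for model, benchmark in per_benchmark:
        models_per_benchmark[benchmark].add(model)
    eligible = {b for b, ms in models_per_benchmark.items() if len(ms) >= min_models}

    # Step 3: average the per-benchmark means within each category.
    by_category = defaultdict(list)
    for (model, benchmark), value in per_benchmark.items():
        if benchmark in eligible:
            by_category[(model, category_of[benchmark])].append(value)
    return {key: mean(values) for key, values in by_category.items()}


# Two benchmarks taken from the score table on this page.
records = [
    ("Gemini 3.1 Pro Preview", "MMMU", "Multimodal understanding", 80.50),
    ("Gemini 2.5-Pro", "MMMU", "Multimodal understanding", 82.00),
    ("Gemini 3.1 Pro Preview", "BrowseComp", "AI Agent - Information gathering", 85.90),
    ("Gemini 2.5-Pro", "BrowseComp", "AI Agent - Information gathering", 7.80),
]
averages = category_averages(records)
```

With one benchmark per category, each category average equals that benchmark's score, so the differences between the two models reproduce the relative edge and gap figures shown in the capability profile.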

Best overall

Gemini 3.1 Pro Preview · 62.26

Best single

Gemini 3.1 Pro Preview · GPQA Diamond 94.30

Modality coverage

Gemini 2.5-Pro · 2 modalities

Head to head

Gemini 3.1 Pro Preview vs Gemini 2.5-Pro

Benchmarks: 7 · Wins: 6 · Losses: 1 · Average diff: +31.43

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
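The source ordering described above amounts to a priority fallback. A minimal sketch, assuming each candidate score carries a source label; the labels in `SOURCE_PRIORITY`, the dictionary keys, and the function `pick_score` are hypothetical and are not taken from the site.

```python
# Hypothetical source labels, ordered from most to least preferred.
SOURCE_PRIORITY = ("official", "leaderboard", "third_party")


def pick_score(candidates):
    """Return the candidate from the highest-priority source, or None.

    candidates: list of dicts with "source" and "score" keys
    (an assumed layout for illustration).
    """
    rank = {name: i for i, name in enumerate(SOURCE_PRIORITY)}
    known = [c for c in candidates if c["source"] in rank]
    # min() with default=None handles the case of no usable candidate.
    return min(known, key=lambda c: rank[c["source"]], default=None)


chosen = pick_score([
    {"source": "third_party", "score": 85.0},
    {"source": "official", "score": 86.4},
])
```

Here the official entry is chosen even though it is listed second, because selection depends on source rank rather than input order.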

Filter: Best Available · 2 modes · 7 benchmarks

Benchmark score table

Complete scores for each model/mode across selected benchmarks.

7 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.

Benchmark	Category	Gemini 3.1 Pro Preview	Gemini 2.5-Pro
ARC-AGI-2	General evaluation	77.10 (Thinking Level · High)	4.90 (Thinking Enabled)
GPQA Diamond	General evaluation	94.30 (Thinking Level · High)	86.40 (Thinking Enabled)
HLE	General evaluation	44.40 (Thinking Level · High)	21.60 (Thinking Enabled)
MMMU	Multimodal understanding	80.50 (Thinking Level · High)	82.00 (Thinking Enabled)
FrontierMath	Mathematical reasoning	36.90 (Thinking Level · High)	11.00 (Standard Mode)
FrontierMath - Tier 4	Mathematical reasoning	16.70 (Standard Mode)	2.10 (Standard Mode)
BrowseComp	AI Agent - Information gathering	85.90 (Thinking Level · High, Tools)	7.80 (Thinking Enabled, Tools)
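The head-to-head figures can be recomputed from the best scores in the table above. The sketch below is an illustration rather than the site's code; the function name `head_to_head` and the dictionary layout are assumptions, while the scores are copied from the table.

```python
# Best scores from the table: (Gemini 3.1 Pro Preview, Gemini 2.5-Pro).
scores = {
    "ARC-AGI-2": (77.10, 4.90),
    "GPQA Diamond": (94.30, 86.40),
    "HLE": (44.40, 21.60),
    "MMMU": (80.50, 82.00),
    "FrontierMath": (36.90, 11.00),
    "FrontierMath - Tier 4": (16.70, 2.10),
    "BrowseComp": (85.90, 7.80),
}


def head_to_head(scores):
    """Summarize wins, losses, ties, and mean difference for the first model."""
    diffs = [first - second for first, second in scores.values()]
    return {
        "benchmarks": len(diffs),
        "wins": sum(d > 0 for d in diffs),
        "losses": sum(d < 0 for d in diffs),
        "ties": sum(d == 0 for d in diffs),
        "average_diff": round(sum(diffs) / len(diffs), 2),
    }


summary = head_to_head(scores)
print(summary)
# → {'benchmarks': 7, 'wins': 6, 'losses': 1, 'ties': 0, 'average_diff': 31.43}
```

The recomputed mean difference agrees with the +31.43 average diff reported in the head-to-head summary, and the single loss is MMMU.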

API price comparison

Side-by-side input/output token pricing

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

Features & specs	Gemini 3.1 Pro Preview (Google DeepMind)	Gemini 2.5-Pro (Google DeepMind)
Core specs
Release	2026-02-20	2025-06-05
Context length	1M	1M
Max output (tokens)	32,768	65,536
MoE	No	No
Supported modes	No mode data	Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode
License
Code Open Source	Not provided	Not provided
Weights Open Source	Not provided	Not provided
Commercial use	Not open source	Not open source
Modality support
Text Input/Output	Not provided	/
Image Input/Output	Not provided	/
Resources
Paper / report	Gemini 3.1 Pro: A smarter model for your most complex tasks	Try the latest Gemini 2.5 Pro before general availability.
DataLearner blog	Not provided	Google releases Gemini 2.5 Pro: the first 2.5-version model in the Gemini series, supporting up to 2 million tokens of context, full-modality input, a reasoning model, ranked first on LMArena