Gemini 3.1 Pro Preview vs Gemini 2.5 Pro Experimental 03-25: Benchmark Comparison

See key specs and per-benchmark scores for each model/mode. This page compares the benchmark data and core specs of 2 models.

Gemini 3.1 Pro Preview

Google DeepMind

Release: 2026-02-20
Context length: 1M
Parameters: Not provided
Max output: 32,768 tokens

Capability profile

Each axis is a category average, normalized to a 100-point radar.

View: Non-parallel mode average·3 dimensions

Gemini 3.1 Pro Preview

Relative edge: Comprehensive evaluation +17.9 / Relative gap: none clear

Gemini 2.5 Pro Experimental 03-25

Relative edge: none clear / Relative gap: Comprehensive evaluation -17.9

Method: for each model and benchmark, the chart first averages all scores in the current mode scope instead of taking the best score, then averages those benchmark scores within each category. Only benchmarks with at least two selected models scored are included; missing values are not counted as zero.
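The averaging steps described above can be sketched roughly as follows. This is only an illustration of the stated method, not the site's actual code: the function name `category_averages`, the record layout, and the sample scores are all hypothetical, and the final normalization onto the 100-point radar is not covered because the page does not specify it.

```python
from collections import defaultdict
from statistics import mean


def category_averages(records, min_models=2):
    """Compute per-model category averages.

    records: iterable of (model, benchmark, category, score) tuples,
    one per mode; score is None when a value is missing.
    (Hypothetical layout, chosen for illustration.)
    """
    # Step 1: average all mode scores per (model, benchmark).
    # Missing values are skipped rather than counted as zero.
    per_bench = defaultdict(list)
    category_of = {}
    for model, bench, category, score in records:
        if score is None:
            continue
        per_bench[(model, bench)].append(score)
        category_of[bench] = category
    bench_avg = {key: mean(vals) for key, vals in per_bench.items()}

    # Step 2: keep only benchmarks scored by at least `min_models` models.
    models_per_bench = defaultdict(set)
    for model, bench in bench_avg:
        models_per_bench[bench].add(model)
    kept = {b for b, ms in models_per_bench.items() if len(ms) >= min_models}

    # Step 3: average the kept benchmark scores within each category.
    per_cat = defaultdict(list)
    for (model, bench), avg in bench_avg.items():
        if bench in kept:
            per_cat[(model, category_of[bench])].append(avg)
    return {key: mean(vals) for key, vals in per_cat.items()}


# Made-up sample data: model A has two modes on bench1; bench2 is
# scored by only one model and is therefore excluded.
records = [
    ("A", "bench1", "general", 90.0),
    ("A", "bench1", "general", 80.0),
    ("B", "bench1", "general", 70.0),
    ("A", "bench2", "general", 50.0),
    ("B", "bench2", "general", None),
    ("A", "bench3", "math", 20.0),
    ("B", "bench3", "math", 10.0),
]
result = category_averages(records)
```

In this sample, model A's two bench1 modes are averaged rather than taking the higher one, and bench2 drops out of the "general" category for both models because only one model has a score for it.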

Best overall

Gemini 3.1 Pro Preview · 60.52

Best single

Gemini 3.1 Pro Preview · GPQA Diamond 94.30

Modality coverage

Gemini 2.5 Pro Experimental 03-25 · 4 modalities

Head to head

Gemini 3.1 Pro Preview vs Gemini 2.5 Pro Experimental 03-25

Average diff: +15.80

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Filter: Best Available · 2 modes · 4 benchmarks

Benchmark score table

Complete scores for each model/mode across selected benchmarks.

4 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.

Benchmark	Category	Gemini 3.1 Pro Preview	Gemini 2.5 Pro Experimental 03-25
GPQA Diamond	Comprehensive evaluation	94.30 (Thinking Level · High)	84.00 (Standard Mode)
HLE	Comprehensive evaluation	44.40 (Thinking Level · High)	18.80 (Standard Mode)
FrontierMath - Tier 4	Math reasoning	16.70 (Standard Mode)	4.20 (Standard Mode)
Pinch Bench	OpenClaw comprehensive agent capability evaluation	86.70 (Thinking Enabled | Tools)	71.90 (Thinking Enabled | Tools)
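As a quick arithmetic check, the head-to-head average diff of +15.80 can be reproduced from the four rows of the score table. The sketch below assumes the diff is simply the mean of (Gemini 3.1 Pro Preview score minus Gemini 2.5 Pro Experimental 03-25 score) over the best-available scores; that reading matches the published figure, but the page does not state the formula. Variable names are illustrative.

```python
# Best-available scores from the table:
# (Gemini 3.1 Pro Preview, Gemini 2.5 Pro Experimental 03-25)
scores = {
    "GPQA Diamond": (94.30, 84.00),
    "HLE": (44.40, 18.80),
    "FrontierMath - Tier 4": (16.70, 4.20),
    "Pinch Bench": (86.70, 71.90),
}

# Per-benchmark difference, first model minus second model.
diffs = {name: a - b for name, (a, b) in scores.items()}

wins = sum(d > 0 for d in diffs.values())
losses = sum(d < 0 for d in diffs.values())
average_diff = sum(diffs.values()) / len(diffs)

print(f"{wins} wins, {losses} losses, average diff {average_diff:+.2f}")
# prints "4 wins, 0 losses, average diff +15.80"
```

Note that the two models' best scores come from different modes on some benchmarks (for example, Thinking Level · High versus Standard Mode on GPQA Diamond), so the diff compares best available results rather than like-for-like configurations.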

API price comparison

Side-by-side input/output token pricing

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

Features & specs	Gemini 3.1 Pro Preview (Google DeepMind)	Gemini 2.5 Pro Experimental 03-25 (Google DeepMind)
Core specs
Release	2026-02-20	2025-03-25
Context length	1M	2000K
Max output (tokens)	32768	65536
MoE	No	No
License
Code Open Source	Not provided	Not provided
Weights Open Source	Not provided	Not provided
Commercial use	Not open source	Not open source
Modality support
Text Input/Output	Not provided	/
Image Input/Output	Not provided	/
Audio Input/Output	Not provided	/
Video Input/Output	Not provided	/
Resources
Paper / report	Gemini 3.1 Pro: A smarter model for your most complex tasks	Gemini 2.5: Our most intelligent AI model