Gemini 3.1 Pro Preview vs Gemini 3.0 Pro (Preview 11-2025)

Across 15 shared benchmarks, Gemini 3.1 Pro Preview leads overall with 12 wins to 3 for Gemini 3.0 Pro (Preview 11-2025), no ties, and an average score difference of +7.84.

Gemini 3.1 Pro Preview

Google DeepMind · 2026-02-20 · Multimodal model

Gemini 3.0 Pro (Preview 11-2025)

Google DeepMind · 2025-11-18 · Multimodal model

Gemini 3.1 Pro Preview: 12 wins (80%) · Gemini 3.0 Pro (Preview 11-2025): 3 wins (20%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 15 shared benchmarks.

General Knowledge

Gemini 3.1 Pro Preview 4/4

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
ARC-AGI-2	77.10 · rank 9 / 62 · Thinking High (No Tools)	45.10 · rank 26 / 62	+32
LiveBench	79.93 · rank 3 / 115 · Thinking High (No Tools)	73.39 · rank 24 / 115 · Thinking High (No Tools)	+6.54
HLE	51.40 · rank 22 / 172 · Thinking High (With Tools)	45.80 · rank 40 / 172	+5.60
GPQA Diamond	94.30 · rank 3 / 187 · Thinking High (No Tools)	93.80 · rank 5 / 187	+0.50

Agent Level Benchmark

Gemini 3.1 Pro Preview 2/2

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
τ²-Bench	90.80 · rank 2 / 43 · Thinking High (With Tools)	85.40 · rank 8 / 43	+5.40
τ²-Bench - Telecom	99.30 · rank 1 / 35 · Thinking High (With Tools)	98 · rank 5 / 35	+1.30

AI Agent - Tool Usage

Gemini 3.1 Pro Preview 2/2

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
Terminal Bench 2.0	68.50 · rank 8 / 47 · Thinking High (With Tools)	56.90 · rank 25 / 47	+11.60
MCP-Atlas	78.20 · rank 9 / 27 · Thinking High (With Tools)	70.30 · rank 15 / 27 · Normal (With Tools)	+7.90

Coding and Software Engineer

Even: 1 win each (2 benchmarks)

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
SWE-bench Verified	80.60 · rank 11 / 112 · Thinking High (With Tools)	76.20 · rank 36 / 112	+4.40
LiveCodeBench	91.70 · rank 3 / 123 · Thinking High (With Tools)	92 · rank 2 / 123	-0.30

Math and Reasoning

Gemini 3.0 Pro (Preview 11-2025) 2/2

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
FrontierMath - Tier 4	16.70 · rank 20 / 80 · Normal (No Tools)	18.80 · rank 16 / 80	-2.10
FrontierMath	36.90 · rank 11 / 60 · Thinking High (No Tools)	38 · rank 10 / 60	-1.10

AI Agent - Information Search

Gemini 3.1 Pro Preview 1/1

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
BrowseComp	85.90 · rank 5 / 53 · Thinking High (With Tools + Internet)	59.20 · rank 38 / 53	+26.70

Claw-style Agent Evaluation

Gemini 3.1 Pro Preview 1/1

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
Pinch Bench	86.70 · rank 10 / 37 · Thinking (With Tools)	70.70 · rank 31 / 37 · Thinking (With Tools)	+16

Commonsense Reasoning

Gemini 3.1 Pro Preview 1/1

Benchmark	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)	Diff
Simple Bench	79.60 · rank 2 / 63 · Normal (No Tools)	76.40 · rank 5 / 63 · Thinking (No Tools)	+3.20

Specs

Field	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)
Publisher	Google DeepMind	Google DeepMind
Release date	2026-02-20	2025-11-18
Model type	Multimodal model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1M	1M
Max output	64K	64K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Gemini 3.1 Pro Preview	Gemini 3.0 Pro (Preview 11-2025)
Text input	$2 / 1M tokens	$2 / 1M tokens
Text output	$12 / 1M tokens	$12 / 1M tokens
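
As a rough illustration of how the per-million-token prices above translate into a per-request cost, the sketch below applies the listed $2 input and $12 output rates. The function name `estimate_cost_usd` and the example token counts are hypothetical, and the sketch assumes flat text-only pricing with no tiering or discounts, which the table does not address.

```python
# Prices copied from the table above (USD per 1M tokens); identical for both models.
INPUT_USD_PER_M = 2.0
OUTPUT_USD_PER_M = 12.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one text-only request, assuming flat per-token pricing."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 50,000 input tokens and 4,000 output tokens.
print(f"${estimate_cost_usd(50_000, 4_000):.3f}")
```

Because both models list the same rates, this estimate is the same for either one under these assumptions.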

Summary

Gemini 3.1 Pro Preview leads in: General Knowledge (4/4), Agent Level Benchmark (2/2), AI Agent - Tool Usage (2/2), AI Agent - Information Search (1/1), Claw-style Agent Evaluation (1/1), Commonsense Reasoning (1/1)
Gemini 3.0 Pro (Preview 11-2025) leads in: Math and Reasoning (2/2)
Tied in: Coding and Software Engineer
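
The per-category leaders listed above can be reproduced from the Diff column of the benchmark tables. The sketch below assumes a category is led by whichever model wins more of its benchmarks and is tied when the win counts are equal; the page does not state its rule, so this is an assumption, and `category_leaders` is a hypothetical helper name.

```python
from collections import defaultdict

NEW = "Gemini 3.1 Pro Preview"
OLD = "Gemini 3.0 Pro (Preview 11-2025)"

# (category, Diff) pairs copied from the tables; Diff = NEW score minus OLD score.
DIFFS = [
    ("General Knowledge", 32.0), ("General Knowledge", 6.54),
    ("General Knowledge", 5.60), ("General Knowledge", 0.50),
    ("Agent Level Benchmark", 5.40), ("Agent Level Benchmark", 1.30),
    ("AI Agent - Tool Usage", 11.60), ("AI Agent - Tool Usage", 7.90),
    ("Coding and Software Engineer", 4.40), ("Coding and Software Engineer", -0.30),
    ("Math and Reasoning", -2.10), ("Math and Reasoning", -1.10),
    ("AI Agent - Information Search", 26.70),
    ("Claw-style Agent Evaluation", 16.0),
    ("Commonsense Reasoning", 3.20),
]


def category_leaders(diffs):
    """Map each category to its leader, assuming a majority-of-wins rule."""
    tally = defaultdict(lambda: [0, 0])  # category -> [NEW wins, OLD wins]
    for category, diff in diffs:
        counts = tally[category]  # register the category even if the diff is zero
        if diff > 0:
            counts[0] += 1
        elif diff < 0:
            counts[1] += 1
    leaders = {}
    for category, (new_wins, old_wins) in tally.items():
        if new_wins > old_wins:
            leaders[category] = NEW
        elif old_wins > new_wins:
            leaders[category] = OLD
        else:
            leaders[category] = "Tied"
    return leaders


for category, leader in category_leaders(DIFFS).items():
    print(f"{category}: {leader}")
```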

On average across the 15 shared benchmarks, Gemini 3.1 Pro Preview scores 7.84 higher.

Largest single-benchmark gap: ARC-AGI-2, where Gemini 3.1 Pro Preview scores 77.10 vs 45.10 for Gemini 3.0 Pro (Preview 11-2025) (+32).
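
The headline figures above (win counts, average difference, largest gap) follow directly from the per-benchmark scores. The sketch below recomputes them from the values in the tables; the variable names are illustrative, and the average is assumed to be an unweighted mean of the 15 differences.

```python
# (benchmark, Gemini 3.1 Pro Preview score, Gemini 3.0 Pro (Preview 11-2025) score)
SCORES = [
    ("ARC-AGI-2", 77.10, 45.10),
    ("LiveBench", 79.93, 73.39),
    ("HLE", 51.40, 45.80),
    ("GPQA Diamond", 94.30, 93.80),
    ("τ²-Bench", 90.80, 85.40),
    ("τ²-Bench - Telecom", 99.30, 98.00),
    ("Terminal Bench 2.0", 68.50, 56.90),
    ("MCP-Atlas", 78.20, 70.30),
    ("SWE-bench Verified", 80.60, 76.20),
    ("LiveCodeBench", 91.70, 92.00),
    ("FrontierMath - Tier 4", 16.70, 18.80),
    ("FrontierMath", 36.90, 38.00),
    ("BrowseComp", 85.90, 59.20),
    ("Pinch Bench", 86.70, 70.70),
    ("Simple Bench", 79.60, 76.40),
]

# Per-benchmark difference, rounded to two decimals to avoid float noise.
diffs = {name: round(new - old, 2) for name, new, old in SCORES}

wins_new = sum(d > 0 for d in diffs.values())
wins_old = sum(d < 0 for d in diffs.values())
ties = sum(d == 0 for d in diffs.values())
mean_diff = round(sum(diffs.values()) / len(diffs), 2)  # unweighted mean (assumption)
largest = max(diffs, key=lambda name: abs(diffs[name]))

print(f"{wins_new} wins vs {wins_old}, {ties} ties, mean diff {mean_diff:+.2f}")
print(f"largest gap: {largest} ({diffs[largest]:+.2f})")
```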

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
