Opus 4.7vsGemini 3.1 Pro Preview

Across 16 shared benchmarks, Opus 4.7 leads overall: Opus 4.7 wins 8, Gemini 3.1 Pro Preview wins 7, with 1 ties and an average score difference of +0.19.

Opus 4.7

Anthropic · 2026-04-16 · Reasoning model

Gemini 3.1 Pro Preview

Google Deep Mind · 2026-02-20 · Multimodal model

Opus 4.78 wins(50%)Ties1(44%)7 winsGemini 3.1 Pro Preview

Benchmark scores

Grouped by capability, sorted by largest gap within each. 16 shared benchmarks.

General Knowledge

Gemini 3.1 Pro Preview 4/6

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
HLE	54.7013 / 172Extended (with tools)	51.4022 / 172Thinking High (With Tools)	+3.30
LiveBench	76.917 / 115Deep Thinking (No Tools)	79.933 / 115Thinking High (No Tools)	-3.02
ARC-AGI-2	75.8011 / 62最高（无工具）	77.109 / 62Thinking High (No Tools)	-1.30
MMLU	91.506 / 66Normal (No Tools)	92.603 / 66Thinking High (No Tools)	-1.10
GPQA Diamond	94.204 / 187Extended (no tools)	94.303 / 187Thinking High (No Tools)	-0.10
ARC-AGI-3	08 / 9Thinking High (No Tools)	06 / 9Thinking High (No Tools)	—

AI Agent - Tool Usage

Opus 4.7 3/4

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
TerminalBench 2.1	69.7019 / 27Thinking High (With Tools)	73.8016 / 27Thinking High (With Tools)	-4.10
OSWorld-Verified	7810 / 24Extended (with tools)	76.2011 / 24Thinking (With Tools)	+1.80
MCP-Atlas	79.107 / 27Deep Thinking (With Tools)	78.209 / 27Thinking High (With Tools)	+0.90
Terminal Bench 2.0	69.406 / 47Extended (with tools)	68.508 / 47Thinking High (With Tools)	+0.90

Coding and Software Engineer

Opus 4.7 2/2

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
SWE-Bench Pro - Public	64.307 / 54Extended (with tools)	54.2032 / 54Thinking High (With Tools)	+10.10
SWE-bench Verified	87.606 / 112Extended (with tools)	80.6011 / 112Thinking High (With Tools)	+7

Math and Reasoning

Opus 4.7 2/2

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
FrontierMath	43.806 / 60极高强度思考（无工具）	36.9011 / 60Thinking High (No Tools)	+6.90
FrontierMath - Tier 4	22.9012 / 80极高强度思考（无工具）	16.7020 / 80Normal (No Tools)	+6.20

AI Agent - Information Search

Gemini 3.1 Pro Preview 1/1

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
BrowseComp	79.3017 / 53Extended (with tools)	85.905 / 53Thinking High (With Tools + Internet)	-6.60

Commonsense Reasoning

Gemini 3.1 Pro Preview 1/1

Benchmark	Opus 4.7	Gemini 3.1 Pro Preview	Diff
Simple Bench	61.7013 / 63Normal (No Tools)	79.602 / 63Normal (No Tools)	-17.90

Specs

Field	Opus 4.7	Gemini 3.1 Pro Preview
Publisher	Anthropic	Google Deep Mind
Release date	2026-04-16	2026-02-20
Model type	Reasoning model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1000K	1M
Max output	128K	64K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Opus 4.7	Gemini 3.1 Pro Preview
Text input	$5 / 1M tokens	$2 / 1M tokens
Text output	$25 / 1M tokens	$12 / 1M tokens
Cache read	$0.5 / 1M tokens	Not public
Cache write	$6.25 / 1M tokens	Not public

Summary

Opus 4.7leads in:AI Agent - Tool Usage (3/4), Coding and Software Engineer (2/2), Math and Reasoning (2/2)
Gemini 3.1 Pro Previewleads in:General Knowledge (4/6), AI Agent - Information Search (1/1), Commonsense Reasoning (1/1)

On average across the 16 shared benchmarks, Opus 4.7 scores 0.19 higher.

Largest single-benchmark gap: Simple Bench — Opus 4.7 61.70 vs Gemini 3.1 Pro Preview 79.60 (-17.90).

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.

Opus 4.7 details Gemini 3.1 Pro Preview details·Customize in compare tool