Gemini 3.5 Flash vs Opus 4.7

Gemini 3.5 Flash and Opus 4.7 are tied across 8 shared benchmarks: Gemini 3.5 Flash leads on 4, Opus 4.7 leads on 4, with 0 ties and an average score difference of -0.36.

Gemini 3.5 Flash

Google DeepMind · 2026-06-20 · Multimodal model

Opus 4.7

Anthropic · 2026-04-16 · Reasoning model

Gemini 3.5 Flash: 4 wins (50%) · Opus 4.7: 4 wins (50%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 8 shared benchmarks.

AI Agent - Tool Usage

Gemini 3.5 Flash 3/3

Benchmark	Gemini 3.5 Flash	Opus 4.7	Diff
TerminalBench 2.1	76.20 · rank 8/16 · Thinking High (With Tools)	69.70 · rank 11/16 · Thinking High (With Tools)	+6.50
MCP-Atlas	83.60 · rank 1/23 · Thinking High (With Tools)	79.10 · rank 5/23 · Deep Thinking (With Tools)	+4.50
OSWorld-Verified	78.40 · rank 6/19 · Thinking High (With Tools)	78.00 · rank 7/19 · Extended (With Tools)	+0.40

General Knowledge

Opus 4.7 3/3

Benchmark	Gemini 3.5 Flash	Opus 4.7	Diff
HLE	40.20 · rank 55/161 · Thinking High (With Tools)	54.70 · rank 9/161 · Extended (With Tools)	-14.50
ARC-AGI-2	72.10 · rank 11/59 · Thinking High (With Tools)	75.80 · rank 9/59 · Max (No Tools)	-3.70
LiveBench	75.02 · rank 17/115 · Thinking High (No Tools)	76.91 · rank 7/115 · Deep Thinking (No Tools)	-1.89

Coding and Software Engineer

Opus 4.7 1/1

Benchmark	Gemini 3.5 Flash	Opus 4.7	Diff
SWE-Bench Pro - Public	55.10 · rank 21/44 · Thinking High (With Tools)	64.30 · rank 4/44 · Extended (With Tools)	-9.20

Math and Reasoning

Gemini 3.5 Flash 1/1

Benchmark	Gemini 3.5 Flash	Opus 4.7	Diff
Simple Bench	76.70 · rank 4/63 · Normal (No Tools)	61.70 · rank 13/63 · Normal (No Tools)	+15.00

Specs

Field	Gemini 3.5 Flash	Opus 4.7
Publisher	Google DeepMind	Anthropic
Release date	2026-06-20	2026-04-16
Model type	Multimodal model	Reasoning model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1M	1M
Max output	64K	128K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Gemini 3.5 Flash	Opus 4.7
Text input	$1.5 / 1M tokens	$5 / 1M tokens
Text output	$9 / 1M tokens	$25 / 1M tokens
Cache read	Not public	$0.5 / 1M tokens
Cache write	Not public	$6.25 / 1M tokens
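
To make the price gap concrete, here is a minimal Python sketch that estimates the cost of one workload from the listed text input and output prices. The function name `estimate_cost` and the example workload (2M input tokens, 500K output tokens) are illustrative assumptions, not figures from the records above. Cache pricing is left out because it is not public for Gemini 3.5 Flash.

```python
# Listed prices in USD per 1M tokens, copied from the API pricing table.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.5, "output": 9.0},
    "Opus 4.7": {"input": 5.0, "output": 25.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost, ignoring cache reads and writes."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 500K output tokens.
    for model in PRICES:
        cost = estimate_cost(model, 2_000_000, 500_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the sketch gives $7.50 for Gemini 3.5 Flash and $22.50 for Opus 4.7. Real bills can differ with caching, batch discounts, or long-context tiers, none of which are modeled here.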

Summary

Gemini 3.5 Flash leads in: AI Agent - Tool Usage (3/3), Math and Reasoning (1/1)
Opus 4.7 leads in: General Knowledge (3/3), Coding and Software Engineer (1/1)

On average across the 8 shared benchmarks, Opus 4.7 scores 0.36 higher.

Largest single-benchmark gap: Simple Bench — Gemini 3.5 Flash 76.70 vs Opus 4.7 61.70 (+15).
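
The summary figures can be reproduced from the benchmark tables. The following Python sketch recomputes the win counts, the average difference, and the largest gap. It assumes each Diff value is the Gemini 3.5 Flash score minus the Opus 4.7 score, which matches the signs in the tables. The function name `summarize` is illustrative.

```python
# Scores copied from the benchmark tables:
# (benchmark, Gemini 3.5 Flash score, Opus 4.7 score).
SCORES = [
    ("TerminalBench 2.1", 76.20, 69.70),
    ("MCP-Atlas", 83.60, 79.10),
    ("OSWorld-Verified", 78.40, 78.00),
    ("HLE", 40.20, 54.70),
    ("ARC-AGI-2", 72.10, 75.80),
    ("LiveBench", 75.02, 76.91),
    ("SWE-Bench Pro - Public", 55.10, 64.30),
    ("Simple Bench", 76.70, 61.70),
]


def summarize(scores):
    """Compute per-benchmark diffs (Gemini minus Opus) and summary stats."""
    diffs = [(name, round(gemini - opus, 2)) for name, gemini, opus in scores]
    return {
        "gemini_wins": sum(1 for _, d in diffs if d > 0),
        "opus_wins": sum(1 for _, d in diffs if d < 0),
        "ties": sum(1 for _, d in diffs if d == 0),
        "mean_diff": sum(d for _, d in diffs) / len(diffs),
        # Benchmark with the largest absolute gap.
        "largest_gap": max(diffs, key=lambda item: abs(item[1])),
    }


if __name__ == "__main__":
    result = summarize(SCORES)
    print(result["gemini_wins"], result["opus_wins"], result["ties"])
    print(round(result["mean_diff"], 2))
    print(result["largest_gap"])
```

The mean is an unweighted average over benchmarks with different scales and settings, so a value near zero reflects offsetting gaps, not near-identical performance on each benchmark.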

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
