Opus 4.7 vs Opus 4.1

Across 4 shared benchmarks, Opus 4.7 leads overall: Opus 4.7 wins 4, Opus 4.1 wins 0, with 0 ties and an average score difference of +20.72.

Opus 4.7: Anthropic · 2026-04-16 · Reasoning model

Opus 4.1: Anthropic · 2025-08-06 · Reasoning model

Opus 4.7: 4 wins (100%) · Opus 4.1: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

Math and Reasoning: Opus 4.7 2/2

Benchmark	Opus 4.7	Opus 4.1	Diff
FrontierMath	43.80 · 6 / 60 · Extremely high-intensity thinking (no tools)	5.90 · 35 / 60 · Normal (No Tools)	+37.90
FrontierMath - Tier 4	22.90 · 12 / 80 · Extremely high-intensity thinking (no tools)	4.20 · 40 / 80 · Thinking (No Tools, 32K Budget)	+18.70

Coding and Software Engineering: Opus 4.7 1/1

Benchmark	Opus 4.7	Opus 4.1	Diff
SWE-bench Verified	87.60 · 5 / 108 · Extended (with tools)	74.50 · 36 / 108 · Extended (with tools)	+13.10

General Knowledge: Opus 4.7 1/1

Benchmark	Opus 4.7	Opus 4.1	Diff
GPQA Diamond	94.20 · 4 / 178 · Extended (no tools)	81.00 · 69 / 178 · Extended (no tools)	+13.20

Prices use DataLearner records when available; missing fields are not inferred.

Opus 4.7 leads in: Math and Reasoning (2/2), Coding and Software Engineering (1/1), General Knowledge (1/1)

On average across the 4 shared benchmarks, Opus 4.7 scores 20.72 higher.

Largest single-benchmark gap: FrontierMath — Opus 4.7 43.80 vs Opus 4.1 5.90 (+37.90).
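
The summary figures can be checked by hand from the four score pairs on this page. The sketch below is only an illustration: the `summarize` helper and its field names are hypothetical, and it assumes the average is a plain mean of the per-benchmark differences, which the page does not state explicitly.

```python
# Scores copied from the benchmark tables on this page:
# (benchmark, Opus 4.7 score, Opus 4.1 score)
scores = [
    ("FrontierMath", 43.80, 5.90),
    ("FrontierMath - Tier 4", 22.90, 4.20),
    ("SWE-bench Verified", 87.60, 74.50),
    ("GPQA Diamond", 94.20, 81.00),
]


def summarize(rows):
    """Hypothetical helper: win counts, mean difference, and largest gap."""
    # Per-benchmark difference (first model minus second), rounded to 2 places
    diffs = [(name, round(a - b, 2)) for name, a, b in rows]
    return {
        "wins_a": sum(1 for _, d in diffs if d > 0),
        "wins_b": sum(1 for _, d in diffs if d < 0),
        "ties": sum(1 for _, d in diffs if d == 0),
        # Assumption: unweighted mean of the per-benchmark differences
        "mean_diff": sum(d for _, d in diffs) / len(diffs),
        # Benchmark with the largest absolute gap
        "largest": max(diffs, key=lambda item: abs(item[1])),
    }


summary = summarize(scores)
```

Under that assumption the four differences sum to 82.9, so the unrounded mean is about 20.725, in line with the 20.72 reported here, and the largest gap is FrontierMath at 37.9 points.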

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.