Claude Opus 4.6 vs Opus 4.5: benchmarks, pricing and specs

Benchmark scores

Benchmarks are grouped by capability and, within each group, sorted by largest score gap. The two models share 14 benchmarks.
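The layout described above (group the benchmarks, order each group by gap size, then report which model leads) can be sketched in a few lines. This is only an illustration, not the site's actual code: the group labels `group_1` and `group_3` are placeholders, the function name `summarize` is hypothetical, and the tie rule (a tie goes to Claude Opus 4.6) is an assumption. The scores are copied from the tables on this page.

```python
from collections import defaultdict

# (group label, benchmark, Claude Opus 4.6 score, Opus 4.5 score)
# Group labels are placeholders; scores come from the tables on this page.
ROWS = [
    ("group_1", "GPQA Diamond", 91.31, 87.0),
    ("group_1", "ARC-AGI-2", 66.30, 37.60),
    ("group_1", "HLE", 53.0, 43.20),
    ("group_1", "ARC-AGI", 92.0, 80.0),
    ("group_3", "SWE-bench Verified", 80.84, 80.90),
    ("group_3", "LiveCodeBench", 76.0, 87.0),
]


def summarize(rows):
    """Group rows, sort each group by absolute gap, and tally the leader."""
    groups = defaultdict(list)
    for group, name, new_score, old_score in rows:
        # Diff is the Claude Opus 4.6 score minus the Opus 4.5 score.
        groups[group].append((name, round(new_score - old_score, 2)))

    summary = {}
    for group, items in groups.items():
        # Largest gap first, regardless of which model is ahead.
        items.sort(key=lambda item: abs(item[1]), reverse=True)
        wins_new = sum(1 for _, diff in items if diff > 0)
        wins_old = sum(1 for _, diff in items if diff < 0)
        # Assumption: a tie is attributed to Claude Opus 4.6.
        leader = "Claude Opus 4.6" if wins_new >= wins_old else "Opus 4.5"
        summary[group] = {
            "leader": leader,
            "wins": max(wins_new, wins_old),
            "total": len(items),
            "rows": items,
        }
    return summary


if __name__ == "__main__":
    for group, info in summarize(ROWS).items():
        print(group, info["leader"], f'{info["wins"]}/{info["total"]}')
        for name, diff in info["rows"]:
            print(f"  {name}: {diff:+.2f}")
```

Sorting on the absolute value of the difference is what puts LiveCodeBench (-11) ahead of SWE-bench Verified (-0.06) in the group where Opus 4.5 is ahead.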

Claude Opus 4.6 leads 4 of 4

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
ARC-AGI-2	66.30, rank 14 / 58, Extended (no tools)	37.60, rank 25 / 58, Extended (no tools)	+28.70
ARC-AGI	92, rank 11 / 65, Extended (no tools)	80, rank 21 / 65, Extended (no tools)	+12
HLE	53, rank 8 / 149, Extended (with tools, internet)	43.20, rank 34 / 149, Extended (with tools)	+9.80
GPQA Diamond	91.31, rank 12 / 175, Extended (no tools)	87, rank 35 / 175, Extended (no tools)	+4.31

Claude Opus 4.6 leads 2 of 2

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
τ²-Bench	91.89, rank 1 / 40, Extended (with tools)	81.99, rank 13 / 40, Extended (with tools)	+9.90
τ²-Bench - Telecom	99.25, rank 2 / 35, Extended (with tools)	90.70, rank 21 / 35, Extended (with tools)	+8.55

Opus 4.5 leads 2 of 2

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
LiveCodeBench	76, rank 35 / 118, Extended (no tools)	87, rank 10 / 118, Extended (with tools)	-11
SWE-bench Verified	80.84, rank 6 / 103, Extended (with tools)	80.90, rank 5 / 103, Extended (with tools)	-0.06

Claude Opus 4.6 leads 2 of 2

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
FrontierMath	40.70, rank 7 / 60, Max (no tools)	20.70, rank 17 / 60, Extended (no tools)	+20
FrontierMath - Tier 4	22.90, rank 12 / 80, Max (no tools)	4.20, rank 40 / 80, Normal (No Tools)	+18.70

Claude Opus 4.6 leads 1 of 1

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
Terminal Bench 2.0	65.40, rank 9 / 43, Extended (with tools)	59.30, rank 17 / 43, Extended (with tools)	+6.10

Claude Opus 4.6 leads 1 of 1

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
Pinch Bench	87.40, rank 7 / 37, Thinking (With Tools)	87.20, rank 8 / 37, Extended (with tools)	+0.20

Claude Opus 4.6 leads 1 of 1

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
IF Bench	94, rank 1 / 27, Extended (no tools)	58, rank 18 / 27, Extended (with tools)	+36

Opus 4.5 leads 1 of 1

Benchmark	Claude Opus 4.6 (score, rank, mode)	Opus 4.5 (score, rank, mode)	Diff
MMMU	77.30, rank 15 / 28, Extended (with tools)	80.70, rank 10 / 28, Extended (no tools)	-3.40

Prices use DataLearner records when available; missing fields are not inferred.