Claude Opus 4.6 vs Claude Opus 4: benchmarks, pricing and specs

Claude Opus 4.6 vs Claude Opus 4: benchmarks, pricing and specs | DataLearnerAI

Benchmark scores

Grouped by capability, sorted by largest gap within each. 11 shared benchmarks.

Claude Opus 4.6 4/4

Benchmark	Claude Opus 4.6	Claude Opus 4	Diff
ARC-AGI-2	66.3014 / 58Extended (no tools)	8.6038 / 58	+57.70
ARC-AGI	9211 / 65Extended (no tools)	35.7048 / 65	+56.30
HLE	538 / 149Extended (with tools, internet)	10.70121 / 149	+42.30
GPQA Diamond	91.3112 / 175Extended (no tools)	79.6076 / 175	+11.71

Claude Opus 4.6 3/4

Benchmark	Claude Opus 4.6	Claude Opus 4	Diff
FrontierMath	40.707 / 60最高（无工具）	4.5039 / 60	+36.20
AIME2025	99.797 / 106Extended (no tools)	75.5065 / 106	+24.29
FrontierMath - Tier 4	22.9012 / 80最高（无工具）	072 / 80Normal (No Tools)	+22.90
MATH-500	97.6010 / 44Extended (no tools)

Claude Opus 4.6 2/2

Benchmark	Claude Opus 4.6	Claude Opus 4	Diff
LiveCodeBench	7635 / 118Extended (no tools)	56.6074 / 118	+19.40
SWE-bench Verified	80.846 / 103Extended (with tools)	72.5043 / 103	+8.34

Claude Opus 4.6 1/1

Benchmark	Claude Opus 4.6	Claude Opus 4	Diff
τ²-Bench	91.891 / 40Extended (with tools)	72.5022 / 40thinking + 使用工具	+19.39

Prices use DataLearner records when available; missing fields are not inferred.