GPT-5.4 vs Claude Opus 4.6: benchmarks, pricing and specs

GPT-5.4 vs Claude Opus 4.6: benchmarks, pricing and specs | DataLearnerAI

Benchmark scores

Grouped by capability, sorted by largest gap within each. 12 shared benchmarks.

GPT-5.4 3/5

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
ARC-AGI-2	77.107 / 58Normal (No Tools)	66.3014 / 58Extended (no tools)	+10.80
ARC-AGI	93.707 / 65Normal (No Tools)	9211 / 65Extended (no tools)	+1.70
GPQA Diamond	92.809 / 175极高强度思考（无工具）	91.3112 / 175Extended (no tools)	+1.49
HLE	52.1011 / 149极高强度思考（工具）	538 / 149Extended (with tools, internet)	-0.90
ARC-AGI-3	04 / 6Thinking High (No Tools)	01 / 6最高（无工具）	—

GPT-5.4 2/2

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
Terminal Bench 2.0	75.104 / 43极高强度思考（工具）	65.409 / 43Extended (with tools)	+9.70
OSWorld-Verified	754 / 14极高强度思考（工具）	72.706 / 14Extended (with tools)	+2.30

GPT-5.4 2/2

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
FrontierMath	47.605 / 60极高强度思考（无工具）	40.707 / 60最高（无工具）	+6.90
FrontierMath - Tier 4	27.1011 / 80极高强度思考（无工具）	22.9012 / 80最高（无工具）	+4.20

Claude Opus 4.6 1/1

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
τ²-Bench - Telecom	64.3030 / 35Normal (With Tools)	99.252 / 35Extended (with tools)	-34.95

Claude Opus 4.6 1/1

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
BrowseComp	82.709 / 43极高强度思考（工具）	846 / 43Thinking (With Tools + Internet)	-1.30

GPT-5.4 1/1

Benchmark	GPT-5.4	Claude Opus 4.6	Diff
Pinch Bench	90.501 / 37Thinking (With Tools)	87.407 / 37Thinking (With Tools)	+3.10

Prices use DataLearner records when available; missing fields are not inferred.