Claude Sonnet 4.6 vs GPT-5.2

Claude Sonnet 4.6 and GPT-5.2 split their 10 shared benchmarks evenly: Claude Sonnet 4.6 leads on 5 and GPT-5.2 leads on 5, with no individual benchmark tied. The average score difference is +1.61 in favor of Claude Sonnet 4.6.

Claude Sonnet 4.6

Anthropic · 2026-02-17 · Chat model

GPT-5.2

OpenAI · 2025-12-11 · Chat model

Claude Sonnet 4.6: 5 wins (50%) · GPT-5.2: 5 wins (50%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 10 shared benchmarks.

General Knowledge

Claude Sonnet 4.6 3/4

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
LiveBench	75.47 · rank 12 / 115 · Thinking Medium (No Tools)	48.91 · rank 94 / 115 · Normal (No Tools)	+26.56
ARC-AGI-2	58.30 · rank 21 / 62	54.20 · rank 23 / 62 · Deep Thinking (No Tools, Parallel)	+4.10
HLE	49 · rank 32 / 172	45.50 · rank 41 / 172 · Deep Thinking (With Tools + Internet)	+3.50
GPQA Diamond	89.90 · rank 24 / 187	93.20 · rank 9 / 187 · Deep Thinking (No Tools, Parallel)	-3.30

Agent Level Benchmark

GPT-5.2 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
τ²-Bench - Telecom	97.90 · rank 9 / 35	98.70 · rank 4 / 35 · Extra-High Thinking (With Tools)	-0.80

AI Agent - Information Search

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
BrowseComp	74.70 · rank 27 / 53	65.80 · rank 31 / 53 · Deep Thinking (With Tools + Internet)	+8.90

AI Agent - Tool Usage

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
MCP-Atlas	69.50 · rank 17 / 27 · Normal (With Tools)	67.60 · rank 18 / 27 · Extra-High Thinking (With Tools)	+1.90

Coding and Software Engineer

GPT-5.2 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
SWE-bench Verified	79.60 · rank 18 / 112	80 · rank 17 / 112 · Extra-High Thinking (With Tools)	-0.40

Math and Reasoning

GPT-5.2 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
FrontierMath - Tier 4	8.30 · rank 34 / 80 · Thinking (No Tools, 16K Budget)	18.80 · rank 16 / 80 · Thinking High (No Tools)	-10.50

Productivity Knowledge

GPT-5.2 1/1

Benchmark	Claude Sonnet 4.6	GPT-5.2	Diff
GDPval-AA	57 · rank 11 / 21	70.90 · rank 9 / 21 · Thinking High (With Tools)	-13.90

Specs

Field	Claude Sonnet 4.6	GPT-5.2
Publisher	Anthropic	OpenAI
Release date	2026-02-17	2025-12-11
Model type	Chat model	Chat model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1M	400K
Max output	8K	Not available

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Claude Sonnet 4.6	GPT-5.2
Text input	$3 / 1M tokens	$1.75 / 1M tokens
Text output	$15 / 1M tokens	$14 / 1M tokens
Cache read	$0.3 / 1M tokens	$0.175 / 1M tokens
Cache write	$3.75 / 1M tokens	$1.75 / 1M tokens
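
To make the per-token prices easier to compare, the following is a minimal sketch that estimates the cost of a workload from the rates in the table above. The function name `estimate_cost`, the `PRICES` dictionary, and the example workload (1M input tokens, 200K output tokens) are illustrative assumptions, not part of the DataLearner records. Cache-write charges are left out, and cache-read tokens are assumed to be billed separately from regular input tokens, which may not match how either provider actually invoices.

```python
# USD per 1M tokens, copied from the API pricing table on this page.
PRICES = {
    "Claude Sonnet 4.6": {"input": 3.00, "output": 15.00, "cache_read": 0.30},
    "GPT-5.2": {"input": 1.75, "output": 14.00, "cache_read": 0.175},
}


def estimate_cost(model, input_tokens, output_tokens, cache_read_tokens=0):
    """Return an estimated cost in USD for one workload (hypothetical helper).

    Assumption: cache-read tokens are counted separately from input_tokens,
    and cache-write charges are ignored.
    """
    rates = PRICES[model]
    total = (
        input_tokens * rates["input"]
        + output_tokens * rates["output"]
        + cache_read_tokens * rates["cache_read"]
    ) / 1_000_000
    return round(total, 4)


if __name__ == "__main__":
    # Example workload (an assumption): 1M input tokens, 200K output tokens.
    for model in PRICES:
        print(model, estimate_cost(model, 1_000_000, 200_000))
```

For this example workload the estimate comes to about $6.00 for Claude Sonnet 4.6 and $4.55 for GPT-5.2; the gap narrows as the share of output tokens grows, because the output rates ($15 vs $14) are much closer than the input rates ($3 vs $1.75).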

Summary

Claude Sonnet 4.6 leads in: General Knowledge (3/4), AI Agent - Information Search (1/1), AI Agent - Tool Usage (1/1)
GPT-5.2 leads in: Agent Level Benchmark (1/1), Coding and Software Engineer (1/1), Math and Reasoning (1/1), Productivity Knowledge (1/1)

On average across the 10 shared benchmarks, Claude Sonnet 4.6 scores 1.61 higher.

Largest single-benchmark gap: LiveBench — Claude Sonnet 4.6 75.47 vs GPT-5.2 48.91 (+26.56).
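
The summary figures can be reproduced from the per-benchmark scores. The sketch below recomputes the win counts, the average difference, and the largest gap; the `summarize` function and the `SCORES` dictionary are illustrative names, and the scores are transcribed from the tables on this page. Note that the two models were often evaluated in different modes (for example, thinking vs normal, with or without tools), so the average mixes settings that are not strictly like-for-like.

```python
# (Claude Sonnet 4.6, GPT-5.2) scores, transcribed from the benchmark tables.
SCORES = {
    "LiveBench": (75.47, 48.91),
    "ARC-AGI-2": (58.30, 54.20),
    "HLE": (49.00, 45.50),
    "GPQA Diamond": (89.90, 93.20),
    "tau2-Bench - Telecom": (97.90, 98.70),
    "BrowseComp": (74.70, 65.80),
    "MCP-Atlas": (69.50, 67.60),
    "SWE-bench Verified": (79.60, 80.00),
    "FrontierMath - Tier 4": (8.30, 18.80),
    "GDPval-AA": (57.00, 70.90),
}


def summarize(scores):
    """Recompute the page's summary statistics (hypothetical helper)."""
    # Diff is Claude minus GPT, rounded to the two decimals shown on the page.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}
    return {
        "claude_wins": sum(d > 0 for d in diffs.values()),
        "gpt_wins": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


if __name__ == "__main__":
    print(summarize(SCORES))
```

The mean is an unweighted average of raw score differences, so a single large gap such as LiveBench (+26.56) has a strong effect on it: without that benchmark, the remaining nine differences sum to a negative number.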

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
