Claude Sonnet 4.6 vs Claude Sonnet 4.5

Across 14 shared benchmarks, Claude Sonnet 4.6 leads overall: it wins 11 and Claude Sonnet 4.5 wins 3, with 0 ties. The average score difference (Claude Sonnet 4.6 minus Claude Sonnet 4.5) is +14.49.

Claude Sonnet 4.6

Anthropic · 2026-02-17 · Chat model

Claude Sonnet 4.5

Anthropic · 2025-09-30 · Chat model

Claude Sonnet 4.6: 11 wins (79%) · Claude Sonnet 4.5: 3 wins (21%)
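The win shares above follow directly from the win counts. A minimal sketch, assuming each share is the win count divided by the 14 shared benchmarks and rounded to the nearest whole percent (the function name `win_share` is illustrative, not something the page defines):

```python
def win_share(wins: int, total: int) -> int:
    """Return wins as a whole-number percentage of total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * wins / total)


total_benchmarks = 14  # shared benchmarks listed on this page
share_46 = win_share(11, total_benchmarks)  # Claude Sonnet 4.6
share_45 = win_share(3, total_benchmarks)   # Claude Sonnet 4.5
print(share_46, share_45)
```

With 11 and 3 wins out of 14, this gives 79 and 21, matching the figures shown.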

Benchmark scores

Grouped by capability, sorted by largest gap within each. 14 shared benchmarks.
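The grouping and ordering described above can be sketched as follows. This is only an illustration: the rows are a subset of the Diff values on this page, and it assumes "largest gap" means the largest absolute difference first, which the page does not state explicitly.

```python
from collections import defaultdict

# (capability, benchmark, diff), where diff = Sonnet 4.6 score minus Sonnet 4.5 score.
rows = [
    ("General Knowledge", "GPQA Diamond", 6.50),
    ("General Knowledge", "ARC-AGI-2", 44.70),
    ("AI Agent - Tool Usage", "MCP-Atlas", 10.00),
    ("General Knowledge", "HLE", 15.40),
    ("AI Agent - Tool Usage", "Terminal Bench 2.0", 16.30),
    ("General Knowledge", "LiveBench", 21.78),
    ("AI Agent - Tool Usage", "OSWorld-Verified", 11.10),
]


def group_and_sort(records):
    """Group records by capability, then order each group by gap size."""
    groups = defaultdict(list)
    for capability, benchmark, diff in records:
        groups[capability].append((benchmark, diff))
    # Assumption: "largest gap" is the largest absolute difference.
    return {
        capability: sorted(items, key=lambda item: abs(item[1]), reverse=True)
        for capability, items in groups.items()
    }


grouped = group_and_sort(rows)
for capability, items in grouped.items():
    print(capability, [name for name, _ in items])
```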

General Knowledge

Claude Sonnet 4.6 4/4

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
ARC-AGI-2	58.30 · rank 21 / 62	13.60 · rank 38 / 62	+44.70
LiveBench	75.47 · rank 12 / 115 · Thinking Medium (No Tools)	53.69 · rank 83 / 115 · Normal (No Tools)	+21.78
HLE	49 · rank 32 / 172	33.60 · rank 80 / 172	+15.40
GPQA Diamond	89.90 · rank 24 / 187	83.40 · rank 63 / 187	+6.50

AI Agent - Tool Usage

Claude Sonnet 4.6 3/3

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
Terminal Bench 2.0	59.10 · rank 22 / 47	42.80 · rank 42 / 47	+16.30
OSWorld-Verified	72.50 · rank 16 / 24	61.40 · rank 20 / 24	+11.10
MCP-Atlas	69.50 · rank 17 / 27 · Normal (With Tools)	59.50 · rank 21 / 27 · Thinking (With Tools)	+10

Agent Level Benchmark

Claude Sonnet 4.5 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
τ²-Bench - Telecom	97.90 · rank 9 / 35	98 · rank 5 / 35	-0.10

AI Agent - Information Search

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
BrowseComp	74.70 · rank 27 / 53	24.10 · rank 51 / 53	+50.60

Claw-style Agent Evaluation

Claude Sonnet 4.5 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
Pinch Bench	88 · rank 5 / 37 · Thinking (With Tools)	88.20 · rank 4 / 37 · Thinking (With Tools)	-0.20

Coding and Software Engineer

Claude Sonnet 4.5 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
SWE-bench Verified	79.60 · rank 18 / 112	82 · rank 8 / 112	-2.40

Long Context

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
AA-LCR	71 · rank 3 / 15	66 · rank 10 / 15	+5

Math and Reasoning

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
FrontierMath - Tier 4	8.30 · rank 34 / 80 · Thinking (No Tools, 16K Budget)	2.10 · rank 56 / 80 · Normal (No Tools)	+6.20

Productivity Knowledge

Claude Sonnet 4.6 1/1

Benchmark	Claude Sonnet 4.6	Claude Sonnet 4.5	Diff
GDPval-AA	57 · rank 11 / 21	39 · rank 16 / 21	+18

Specs

Field	Claude Sonnet 4.6	Claude Sonnet 4.5
Publisher	Anthropic	Anthropic
Release date	2026-02-17	2025-09-30
Model type	Chat model	Chat model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1M	1M
Max output	8K	64K

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	Claude Sonnet 4.6	Claude Sonnet 4.5
Text input	$3 / 1M tokens	$3 / 1M tokens
Text output	$15 / 1M tokens	$15 / 1M tokens
Cache read	$0.3 / 1M tokens	$0.3 / 1M tokens
Cache write	$3.75 / 1M tokens	$3.75 / 1M tokens
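Because the listed rates are identical for both models, the same cost arithmetic applies to either one. A minimal sketch of that arithmetic, assuming each token category is billed separately at the per-million rate from the table (the function name, the category keys, and the example workload are hypothetical, and real billing rules may differ):

```python
# USD per 1M tokens, copied from the pricing table (identical for both models).
PRICE_PER_MILLION = {
    "input": 3.00,
    "output": 15.00,
    "cache_read": 0.30,
    "cache_write": 3.75,
}


def estimate_cost(tokens: dict[str, int]) -> float:
    """Estimate the USD cost of a request from per-category token counts."""
    unknown = set(tokens) - set(PRICE_PER_MILLION)
    if unknown:
        raise KeyError(f"unknown token categories: {sorted(unknown)}")
    return sum(
        PRICE_PER_MILLION[kind] * count / 1_000_000
        for kind, count in tokens.items()
    )


# Hypothetical workload: 200K input tokens and 10K output tokens.
cost = estimate_cost({"input": 200_000, "output": 10_000})
print(f"${cost:.2f}")
```

For this hypothetical workload the estimate is $0.60 for input plus $0.15 for output, or $0.75 in total.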

Summary

Claude Sonnet 4.6 leads in: General Knowledge (4/4), AI Agent - Tool Usage (3/3), AI Agent - Information Search (1/1), Long Context (1/1), Math and Reasoning (1/1), Productivity Knowledge (1/1)
Claude Sonnet 4.5 leads in: Agent Level Benchmark (1/1), Claw-style Agent Evaluation (1/1), Coding and Software Engineer (1/1)

On average across the 14 shared benchmarks, Claude Sonnet 4.6 scores 14.49 higher.

Largest single-benchmark gap: BrowseComp — Claude Sonnet 4.6 74.70 vs Claude Sonnet 4.5 24.10 (+50.60).
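The summary figures can be rechecked from the Diff column of the benchmark tables. A short sketch, assuming the average is the plain arithmetic mean of the 14 per-benchmark differences (the variable names are illustrative):

```python
# Per-benchmark differences (Sonnet 4.6 minus Sonnet 4.5), copied from the Diff column.
diffs = {
    "ARC-AGI-2": 44.70,
    "LiveBench": 21.78,
    "HLE": 15.40,
    "GPQA Diamond": 6.50,
    "Terminal Bench 2.0": 16.30,
    "OSWorld-Verified": 11.10,
    "MCP-Atlas": 10.00,
    "τ²-Bench - Telecom": -0.10,
    "BrowseComp": 50.60,
    "Pinch Bench": -0.20,
    "SWE-bench Verified": -2.40,
    "AA-LCR": 5.00,
    "FrontierMath - Tier 4": 6.20,
    "GDPval-AA": 18.00,
}

mean_diff = sum(diffs.values()) / len(diffs)
wins_46 = sum(1 for d in diffs.values() if d > 0)  # benchmarks where 4.6 scores higher
wins_45 = sum(1 for d in diffs.values() if d < 0)  # benchmarks where 4.5 scores higher
largest_gap = max(diffs, key=lambda name: abs(diffs[name]))

print(f"mean diff: {mean_diff:+.2f}")
print(f"wins: {wins_46} vs {wins_45}")
print(f"largest gap: {largest_gap} ({diffs[largest_gap]:+.2f})")
```

The differences sum to 202.88, so the mean over 14 benchmarks rounds to +14.49, with 11 positive and 3 negative entries and BrowseComp as the largest gap.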

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
