Claude Sonnet 4.5 vs Claude 3.5 Sonnet

Across 4 shared benchmarks, Claude Sonnet 4.5 leads overall: Claude Sonnet 4.5 wins 4, Claude 3.5 Sonnet wins 0, with 0 ties and an average score difference of +10.17.

Claude Sonnet 4.5

Anthropic · 2025-09-30 · Chat model

Claude 3.5 Sonnet

Anthropic · 2024-06-21 · Multimodal model

Claude Sonnet 4.5: 4 wins (100%) · Claude 3.5 Sonnet: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge

Claude Sonnet 4.5 2/2

Benchmark	Claude Sonnet 4.5	Claude 3.5 Sonnet	Diff
GPQA Diamond	83.40 (58 / 178)	59.40 (141 / 178)	+24
MMLU Pro	88 (7 / 126)	77.64 (74 / 126)	+10.36

Math and Reasoning

Claude Sonnet 4.5 2/2

Benchmark	Claude Sonnet 4.5	Claude 3.5 Sonnet	Diff
FrontierMath	5.20 (38 / 60)	1 (52 / 60)	+4.20
FrontierMath - Tier 4	2.10 (56 / 80) · Normal (No Tools)	0 (72 / 80) · Normal (No Tools)	+2.10

Specs

Field	Claude Sonnet 4.5	Claude 3.5 Sonnet
Publisher	Anthropic	Anthropic
Release date	2025-09-30	2024-06-21
Model type	Chat model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1000K	200K
Max output	64K	Not available

Summary

Claude Sonnet 4.5 leads in: General Knowledge (2/2), Math and Reasoning (2/2)

On average across the 4 shared benchmarks, Claude Sonnet 4.5 scores 10.17 higher.

Largest single-benchmark gap: GPQA Diamond — Claude Sonnet 4.5 83.40 vs Claude 3.5 Sonnet 59.40 (+24).
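
The summary figures can be rechecked from the four score pairs in the benchmark tables. The following is a minimal sketch, not the page's actual generator: the `SCORES` list and the `summarize` helper are hypothetical names, and rounding the average half-up to two decimals is an assumption chosen because it reproduces the reported +10.17.

```python
from decimal import Decimal, ROUND_HALF_UP

# Scores copied from the benchmark tables on this page:
# (benchmark, Claude Sonnet 4.5, Claude 3.5 Sonnet)
SCORES = [
    ("GPQA Diamond", Decimal("83.40"), Decimal("59.40")),
    ("MMLU Pro", Decimal("88"), Decimal("77.64")),
    ("FrontierMath", Decimal("5.20"), Decimal("1")),
    ("FrontierMath - Tier 4", Decimal("2.10"), Decimal("0")),
]


def summarize(rows):
    """Return win counts, average difference, and the largest-gap benchmark."""
    # Difference is first model minus second model, per benchmark.
    diffs = {name: a - b for name, a, b in rows}
    wins_a = sum(1 for d in diffs.values() if d > 0)
    wins_b = sum(1 for d in diffs.values() if d < 0)
    ties = sum(1 for d in diffs.values() if d == 0)
    # Assumption: the average is rounded half-up to two decimal places.
    avg = (sum(diffs.values()) / len(diffs)).quantize(
        Decimal("0.01"), rounding=ROUND_HALF_UP
    )
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "diffs": diffs,
        "wins_a": wins_a,
        "wins_b": wins_b,
        "ties": ties,
        "avg_diff": avg,
        "largest_gap": largest,
    }


if __name__ == "__main__":
    result = summarize(SCORES)
    for name, diff in result["diffs"].items():
        print(f"{name}: {diff:+}")
    print("wins:", result["wins_a"], "vs", result["wins_b"], "ties:", result["ties"])
    print("average difference:", result["avg_diff"])
    print("largest gap:", result["largest_gap"])
```

Decimal arithmetic is used instead of floats so that the exact midpoint 10.165 rounds predictably; with binary floats the second decimal could land on either side.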

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
