Claude Sonnet 4.5 vs Claude 3.5 Sonnet New

Across 6 shared benchmarks, Claude Sonnet 4.5 leads overall: it wins all 6, Claude 3.5 Sonnet New wins 0, there are no ties, and the average score difference is +16.48 in Claude Sonnet 4.5's favor.

Claude Sonnet 4.5

Anthropic · 2025-09-30 · Chat model

Claude 3.5 Sonnet New

Anthropic · 2024-10-22 · Chat model

Claude Sonnet 4.5: 6 wins (100%) · Claude 3.5 Sonnet New: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 6 shared benchmarks.

Coding and Software Engineering

Claude Sonnet 4.5 leads 2/2

Benchmark	Claude Sonnet 4.5	Claude 3.5 Sonnet New	Diff
SWE-bench Verified	82 (rank 6 / 108)	49 (rank 93 / 108)	+33
LiveCodeBench	71 (rank 47 / 120)	38.70 (rank 102 / 120)	+32.30

General Knowledge

Claude Sonnet 4.5 leads 2/2

Benchmark	Claude Sonnet 4.5	Claude 3.5 Sonnet New	Diff
GPQA Diamond	83.40 (rank 58 / 178)	65 (rank 131 / 178)	+18.40
MMLU Pro	88 (rank 7 / 126)	78 (rank 69 / 126)	+10

Math and Reasoning

Claude Sonnet 4.5 leads 2/2

Benchmark	Claude Sonnet 4.5	Claude 3.5 Sonnet New	Diff
FrontierMath	5.20 (rank 38 / 60)	2.10 (rank 47 / 60)	+3.10
FrontierMath - Tier 4	2.10 (rank 56 / 80) · Normal (No Tools)	0 (rank 72 / 80) · Normal (No Tools)	+2.10
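
The "grouped by capability, sorted by largest gap within each" layout used for these tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the scores are transcribed from the tables above, the category label for coding is shortened, and `ROWS` and `group_by_gap` are hypothetical names, not part of the site's actual pipeline.

```python
from collections import defaultdict

# (category, benchmark, Claude Sonnet 4.5 score, Claude 3.5 Sonnet New score)
# Values transcribed from the tables above; the layout is illustrative.
ROWS = [
    ("Coding", "LiveCodeBench", 71.0, 38.7),
    ("Coding", "SWE-bench Verified", 82.0, 49.0),
    ("General Knowledge", "MMLU Pro", 88.0, 78.0),
    ("General Knowledge", "GPQA Diamond", 83.4, 65.0),
    ("Math and Reasoning", "FrontierMath - Tier 4", 2.1, 0.0),
    ("Math and Reasoning", "FrontierMath", 5.2, 2.1),
]


def group_by_gap(rows):
    """Group rows by category, then sort each group by descending score gap."""
    groups = defaultdict(list)
    for category, benchmark, score_a, score_b in rows:
        # Round to two decimals to avoid floating-point noise in the gap.
        groups[category].append((benchmark, round(score_a - score_b, 2)))
    return {
        category: sorted(items, key=lambda item: item[1], reverse=True)
        for category, items in groups.items()
    }


grouped = group_by_gap(ROWS)
for category, items in grouped.items():
    print(category)
    for benchmark, gap in items:
        print(f"  {benchmark}: {gap:+.2f}")
```

The input rows are deliberately listed out of order so that the sort step, rather than the transcription order, determines the final layout.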

Specs

Field	Claude Sonnet 4.5	Claude 3.5 Sonnet New
Publisher	Anthropic	Anthropic
Release date	2025-09-30	2024-10-22
Model type	Chat model	Chat model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	1000K	200K
Max output	64K	Not available

Summary

Claude Sonnet 4.5 leads in: Coding and Software Engineering (2/2), General Knowledge (2/2), Math and Reasoning (2/2)

On average across the 6 shared benchmarks, Claude Sonnet 4.5 scores 16.48 higher.

Largest single-benchmark gap: SWE-bench Verified, where Claude Sonnet 4.5 scores 82 vs 49 for Claude 3.5 Sonnet New (+33).
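
The summary figures can be reproduced from the six score pairs. The following is a minimal sketch, assuming the average is the plain mean of signed score differences (an assumption that happens to match the +16.48 reported here); `SCORES` and `summarize` are illustrative names, not the site's own code.

```python
# Benchmark -> (Claude Sonnet 4.5 score, Claude 3.5 Sonnet New score),
# transcribed from the benchmark tables on this page.
SCORES = {
    "SWE-bench Verified": (82.0, 49.0),
    "LiveCodeBench": (71.0, 38.7),
    "GPQA Diamond": (83.4, 65.0),
    "MMLU Pro": (88.0, 78.0),
    "FrontierMath": (5.2, 2.1),
    "FrontierMath - Tier 4": (2.1, 0.0),
}


def summarize(scores):
    """Return win counts, ties, mean signed difference, and the largest gap."""
    diffs = {name: a - b for name, (a, b) in scores.items()}
    return {
        "wins_a": sum(d > 0 for d in diffs.values()),
        "wins_b": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        # Assumed definition: arithmetic mean of the signed differences.
        "average_diff": sum(diffs.values()) / len(diffs),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


summary = summarize(SCORES)
print(f"wins: {summary['wins_a']}-{summary['wins_b']}, ties: {summary['ties']}")
print(f"average difference: {summary['average_diff']:+.2f}")
print(f"largest gap: {summary['largest_gap']}")
```

Because every benchmark carries equal weight in this mean, the two large coding gaps dominate the average even though the FrontierMath gaps are only a few points.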

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
