- Claude Sonnet 4.6 leads in: General Knowledge (3/3), AI Agent - Tool Usage (2/2), AI Agent - Information Search (1/1), Long Context (1/1), Math and Reasoning (1/1), Productivity Knowledge (1/1)
- Claude Sonnet 4.5 leads in: Agent Level Benchmark (1/1), Claw-style Agent Evaluation (1/1), Coding and Software Engineer (1/1)
On average across the 12 shared benchmarks, Claude Sonnet 4.6 scores 18.09 points higher than Claude Sonnet 4.5.
Largest single-benchmark gap: ARC-AGI-2 — Claude Sonnet 4.6 58.30 vs Claude Sonnet 4.5 3.80 (+54.50).
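As a rough illustration of how the average difference and the largest gap could be derived from per-benchmark scores, here is a minimal Python sketch. The function name `compare` and the two placeholder benchmarks are hypothetical; only the ARC-AGI-2 pair (58.30 vs 3.80) comes from this page, so the printed average is not the 18.09 figure reported above.

```python
from statistics import mean


def compare(scores_a, scores_b):
    """Summarize score differences (a minus b) over the benchmarks both models share.

    Returns (mean difference, benchmark with the largest absolute gap, that gap).
    """
    shared = sorted(set(scores_a) & set(scores_b))
    diffs = {name: scores_a[name] - scores_b[name] for name in shared}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return round(mean(list(diffs.values())), 2), largest, round(diffs[largest], 2)


# Only the ARC-AGI-2 values are taken from the page.
# The "placeholder-*" entries are made-up numbers for illustration.
sonnet_46 = {"ARC-AGI-2": 58.30, "placeholder-1": 70.0, "placeholder-2": 60.0}
sonnet_45 = {"ARC-AGI-2": 3.80, "placeholder-1": 65.0, "placeholder-2": 62.0}

avg_diff, name, gap = compare(sonnet_46, sonnet_45)
print(f"Average difference: {avg_diff:+.2f}")
print(f"Largest gap: {name} ({gap:+.2f})")
```

With the full set of 12 shared benchmark scores in place of the placeholders, the same calculation would presumably yield the figures reported on this page.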
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.