See key specs and per-benchmark scores for each model/mode. Scroll horizontally for all columns. This page compares benchmark results and key specs for 2 models.

Claude Sonnet 4.5 (Anthropic)
Best overall: Claude Sonnet 4.5 · 58.87
Best single: Claude Sonnet 4.5 · MMLU Pro 88.00
Modality coverage: Claude Sonnet 4.5 · 2 modalities
Head to head
Benchmarks: 3
Wins: 3
Losses: 0
Average diff: +12.85
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
3 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | Category | Claude Sonnet 4.5 | Claude 3.5 Sonnet |
|---|---|---|---|
| GPQA Diamond | General evaluation | 83.40 (Thinking Enabled) | 59.40 (Standard Mode) |
| MMLU Pro | General evaluation | 88.00 (Thinking Enabled) | 77.64 (Standard Mode) |
| FrontierMath | Mathematical reasoning | 5.20 (Standard Mode) | 1.00 (Standard Mode) |
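The head-to-head figures shown on this page (3 benchmarks, 3 wins, 0 losses, +12.85 average diff, 58.87 overall) can be reproduced from the three rows of the benchmark table. The sketch below assumes that the overall score is a plain unweighted mean of the listed scores and that the average diff is the mean of the per-benchmark score differences; the page does not state its formula, but these assumptions match the displayed numbers. The function name `head_to_head` and the dictionary layout are illustrative only.

```python
# Scores copied from the benchmark table:
# (Claude Sonnet 4.5, Claude 3.5 Sonnet)
scores = {
    "GPQA Diamond": (83.40, 59.40),
    "MMLU Pro": (88.00, 77.64),
    "FrontierMath": (5.20, 1.00),
}


def head_to_head(scores):
    """Summarize model A against model B over their shared benchmarks.

    Assumes unweighted means; the page's actual method is not documented.
    """
    diffs = [a - b for a, b in scores.values()]
    n = len(scores)
    return {
        "benchmarks": n,
        "wins": sum(d > 0 for d in diffs),
        "losses": sum(d < 0 for d in diffs),
        "average_a": round(sum(a for a, _ in scores.values()) / n, 2),
        "average_diff": round(sum(diffs) / n, 2),
    }


summary = head_to_head(scores)
print(summary)
```

Because the two models were scored in different modes on some benchmarks (Thinking Enabled versus Standard Mode), the average diff compares each model's best listed score rather than like-for-like settings.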
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Group | Features & specs | Claude Sonnet 4.5 (Anthropic) | Claude 3.5 Sonnet (Anthropic) |
|---|---|---|---|
| Core specs | Release | 2025-09-30 | 2024-06-21 |
| Core specs | Context length | 1000K | 200K |
| Core specs | Max output | 65536 | Not provided |
| Core specs | MoE | No | No |
| Core specs | Supported modes | Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode | No mode data |
| License | Code Open Source | Not provided | Not provided |
| License | Weights Open Source | Not provided | Not provided |
| License | Commercial use | Not open source | Not open source |
| Modality support | Text Input/Output | / | Not provided |
| Modality support | Image Input/Output | / | Not provided |
| Resources | Paper / report | Introducing Claude Sonnet 4.5 | Claude 3.5 Sonnet |
| Resources | DataLearner blog | The world's strongest coding model gets an upgrade: Anthropic releases Claude Sonnet 4.5, along with a wave of major tool updates, including state saving in Claude Code | Anthropic releases the Claude3.5-Sonnet model, surpassing all Claude3 series models in capability and supporting multimodality |