Haiku 4.5 vs Claude 3.5 Haiku
Across 3 shared benchmarks, Haiku 4.5 leads overall: Haiku 4.5 wins 3, Claude 3.5 Haiku wins 0, with 0 ties and an average score difference of +11.23.
Haiku 4.5
Anthropic · 2025-10-15 · Multimodal model
Claude 3.5 Haiku
Anthropic · 2024-10-22 · Foundation model
Haiku 4.5: 3 wins (100%) · Claude 3.5 Haiku: 0 wins (0%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 3 shared benchmarks.
General Knowledge
Haiku 4.5: 2/2

| Benchmark | Haiku 4.5 | Claude 3.5 Haiku | Diff |
|---|---|---|---|
| GPQA Diamond | 60.50 (138 / 178, Normal (No Tools)) | 41.60 (162 / 178) | +18.90 |
| MMLU Pro | 76 (78 / 126, Normal (No Tools)) | 65 (101 / 126) | +11 |
Math and Reasoning
Haiku 4.5: 1/1

| Benchmark | Haiku 4.5 | Claude 3.5 Haiku | Diff |
|---|---|---|---|
| FrontierMath | 4.10 (41 / 60, Normal (No Tools)) | 0.30 (57 / 60) | +3.80 |
Specs
| Field | Haiku 4.5 | Claude 3.5 Haiku |
|---|---|---|
| Publisher | Anthropic | Anthropic |
| Release date | 2025-10-15 | 2024-10-22 |
| Model type | Multimodal model | Foundation model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | 200K | 200K |
| Max output | 64K | Not available |
Summary
- Haiku 4.5 leads in: General Knowledge (2/2), Math and Reasoning (1/1)
On average across the 3 shared benchmarks, Haiku 4.5 scores 11.23 higher.
Largest single-benchmark gap: GPQA Diamond — Haiku 4.5 60.50 vs Claude 3.5 Haiku 41.60 (+18.90).
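The summary figures can be reproduced from the three table rows. The sketch below is a minimal illustration that assumes the average is a simple unweighted mean of the per-benchmark score differences; the page does not state how it computes the figure, and the names `scores` and `summarize` are hypothetical.

```python
from statistics import mean

# Scores copied from the benchmark tables: (Haiku 4.5, Claude 3.5 Haiku).
scores = {
    "GPQA Diamond": (60.50, 41.60),
    "MMLU Pro": (76.0, 65.0),
    "FrontierMath": (4.10, 0.30),
}


def summarize(scores):
    """Return win counts, mean difference, and the largest-gap benchmark.

    Assumption: an unweighted mean of (first model - second model) differences.
    """
    diffs = {name: a - b for name, (a, b) in scores.items()}
    return {
        "wins_first": sum(d > 0 for d in diffs.values()),
        "wins_second": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": mean(diffs.values()),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


summary = summarize(scores)
print(f"wins: {summary['wins_first']}-{summary['wins_second']}, "
      f"ties: {summary['ties']}, "
      f"mean diff: {summary['mean_diff']:+.2f}, "
      f"largest gap: {summary['largest_gap']}")
```

Under that assumption the differences are 18.90, 11 and 3.80, which sum to 33.70 and average to roughly 11.23, matching the figure quoted in the summary.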
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.