GPT-5.4 mini vs Haiku 4.5
Across 5 shared benchmarks, GPT-5.4 mini leads overall: it wins 3, Haiku 4.5 wins 1, and 1 is tied, with an average score difference of +13.11 in GPT-5.4 mini's favor.
GPT-5.4 mini
OpenAI · 2026-03-17 · Reasoning model
Haiku 4.5
Anthropic · 2025-10-15 · Multimodal model
GPT-5.4 mini: 3 wins (60%) · Ties: 1 (20%) · Haiku 4.5: 1 win
Benchmark scores
Grouped by capability, sorted by largest gap within each. 5 shared benchmarks.
General Knowledge
GPT-5.4 mini leads 2/2

| Benchmark | GPT-5.4 mini | Haiku 4.5 | Diff |
|---|---|---|---|
| HLE | 41.50 · rank 46 / 157 · Extra-high thinking (with tools) | 4.30 · rank 155 / 157 · Normal (No Tools) | +37.20 |
| GPQA Diamond | 88 · rank 32 / 178 · Extra-high thinking (no tools) | 60.50 · rank 138 / 178 · Normal (No Tools) | +27.50 |
Claw-style Agent Evaluation
Haiku 4.5 leads 1/1

| Benchmark | GPT-5.4 mini | Haiku 4.5 | Diff |
|---|---|---|---|
| Claw Bench | 75.30 · rank 25 / 29 · Thinking (With Tools) | 89.40 · rank 11 / 29 · Thinking (With Tools) | -14.10 |
Coding and Software Engineer
GPT-5.4 mini leads 1/1

| Benchmark | GPT-5.4 mini | Haiku 4.5 | Diff |
|---|---|---|---|
| SWE-Bench Pro - Public | 54.40 · rank 21 / 43 · Extra-high thinking (with tools) | 39.45 · rank 40 / 43 · Extended (with tools) | +14.95 |
Math and Reasoning
Tied 1/1

| Benchmark | GPT-5.4 mini | Haiku 4.5 | Diff |
|---|---|---|---|
| FrontierMath - Tier 4 | 2.10 · rank 56 / 80 · Thinking High (No Tools) | 2.10 · rank 56 / 80 · Thinking (No Tools, 32K Budget) | 0.00 (tie) |
Specs
| Field | GPT-5.4 mini | Haiku 4.5 |
|---|---|---|
| Publisher | OpenAI | Anthropic |
| Release date | 2026-03-17 | 2025-10-15 |
| Model type | Reasoning model | Multimodal model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | 400K | 200K |
| Max output | 128K | 64K |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | GPT-5.4 mini | Haiku 4.5 |
|---|---|---|
| Text input | $0.75 / 1M tokens | Not public |
| Text output | $4.5 / 1M tokens | Not public |
| Cache read | $4.5 / 1M tokens | Not public |
| Cache write | $0.075 / 1M tokens | Not public |
One or both models have incomplete public pricing.
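As a rough illustration of how the per-token prices above turn into a request cost, here is a minimal sketch. It uses only the text input and text output prices listed for GPT-5.4 mini. The function name `estimate_cost` and the token counts in the usage line are hypothetical, and a model with no public price yields `None` rather than an inferred figure.

```python
from decimal import Decimal

# Prices in USD per 1M tokens, copied from the table above.
# None marks a field listed as "Not public"; it is not inferred.
PRICES = {
    "GPT-5.4 mini": {"input": Decimal("0.75"), "output": Decimal("4.5")},
    "Haiku 4.5": {"input": None, "output": None},
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost, or None if a price is missing.

    Hypothetical helper: ignores cache pricing and any other billing rules.
    """
    price = PRICES[model]
    if price["input"] is None or price["output"] is None:
        return None
    input_cost = price["input"] * input_tokens / TOKENS_PER_UNIT
    output_cost = price["output"] * output_tokens / TOKENS_PER_UNIT
    return input_cost + output_cost


# Illustrative token counts, not measured usage.
print(estimate_cost("GPT-5.4 mini", 200_000, 20_000))
print(estimate_cost("Haiku 4.5", 200_000, 20_000))
```

`Decimal` is used here so the small per-token amounts add up without floating-point rounding noise.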
Summary
- GPT-5.4 mini leads in: General Knowledge (2/2), Coding and Software Engineer (1/1)
- Haiku 4.5 leads in: Claw-style Agent Evaluation (1/1)
- Tied in: Math and Reasoning
On average across the 5 shared benchmarks, GPT-5.4 mini scores 13.11 higher.
Largest single-benchmark gap: HLE — GPT-5.4 mini 41.50 vs Haiku 4.5 4.30 (+37.20).
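The summary figures can be reproduced from the per-benchmark scores. The sketch below assumes the average is the plain mean of the signed differences (GPT-5.4 mini minus Haiku 4.5), which is consistent with the reported +13.11. The function and variable names are illustrative and not part of the page's own tooling.

```python
# Scores copied from the benchmark tables: (GPT-5.4 mini, Haiku 4.5).
SCORES = {
    "HLE": (41.50, 4.30),
    "GPQA Diamond": (88.00, 60.50),
    "Claw Bench": (75.30, 89.40),
    "SWE-Bench Pro - Public": (54.40, 39.45),
    "FrontierMath - Tier 4": (2.10, 2.10),
}


def summarize(scores):
    """Compute win/tie counts, mean signed difference, and the largest gap."""
    # Signed difference per benchmark, rounded to two decimals like the tables.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}
    return {
        "diffs": diffs,
        "wins_a": sum(d > 0 for d in diffs.values()),
        "wins_b": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


summary = summarize(SCORES)
print(summary)
```

Under that assumption, the script gives 3 wins, 1 loss, and 1 tie for GPT-5.4 mini, a mean difference of 13.11, and HLE as the largest gap.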
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.