Haiku 4.5 Benchmark Details
Haiku 4.5's current benchmark results are led by AIME2025 (20 / 107, score 96.30), LiveBench (16 / 51, score 71.38), and Terminal-Bench (11 / 35, score 41). This page also compares it with 2 competitor models and 2 predecessor or same-series models, including performance and pricing views when available.
Benchmark Results
Competitor Comparison
Benchmark scores for Haiku 4.5 compared against top models in its class
Benchmark Score Comparison
9 benchmarks with comparable scores
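| Benchmark | Haiku 4.5 (This model) | GPT-5.4 mini | Gemini 3.0 Flash |
|---|---|---|---|
| ARC-AGI-2 (comprehensive evaluation) | 4.50, extended (no tools) | -- | 33.60, thinking |
| GPQA Diamond (comprehensive evaluation) | 73.30, extended (no tools) | 88.00, extra-high-effort thinking (no tools) | 90.40, thinking |
| HLE (comprehensive evaluation) | 9.70, extended (no tools) | 41.50, extra-high-effort thinking (tools) | 43.50, thinking + tools |
| SWE-Bench Pro - Public (coding and software engineering) | 39.45, extended (tools) | 54.40, extra-high-effort thinking (tools) | -- |
| SWE-bench Verified (coding and software engineering) | 73.30, thinking mode (tools, 128K budget) | -- | 68.70, thinking |
| AIME2025 (mathematical reasoning) | 96.30, thinking mode (tools, 128K budget) | -- | 99.70, thinking + tools |
| τ²-Bench (agent capability evaluation) | 33.00, standard mode (tools) | -- | 90.20, thinking + tools |
| Claw Bench (OpenClaw comprehensive agent capability evaluation) | 89.40, thinking mode (tools) | 75.30, thinking mode (tools) | 85.70, thinking mode (tools) |
| Pinch Bench (OpenClaw comprehensive agent capability evaluation) | 82.00, thinking mode (tools) | -- | 85.20, thinking mode (tools) |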
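As a rough way to read the table above, the following sketch tallies the benchmarks on which Haiku 4.5 has the highest reported score. The scores are copied from the table; the `scores` dictionary, the `MODELS` tuple, and the `leader` helper are illustrative names, not part of the page. It ignores the fact that the models were run in different reasoning modes and tool settings, so the result is only indicative.

```python
# Scores copied from the comparison table; None marks a missing ("--") entry.
# Column order: Haiku 4.5, GPT-5.4 mini, Gemini 3.0 Flash.
MODELS = ("Haiku 4.5", "GPT-5.4 mini", "Gemini 3.0 Flash")

scores = {
    "ARC-AGI-2": (4.50, None, 33.60),
    "GPQA Diamond": (73.30, 88.00, 90.40),
    "HLE": (9.70, 41.50, 43.50),
    "SWE-Bench Pro - Public": (39.45, 54.40, None),
    "SWE-bench Verified": (73.30, None, 68.70),
    "AIME2025": (96.30, None, 99.70),
    "τ²-Bench": (33.00, None, 90.20),
    "Claw Bench": (89.40, 75.30, 85.70),
    "Pinch Bench": (82.00, None, 85.20),
}


def leader(row):
    """Return the name of the model with the highest reported score in a row."""
    reported = [(score, model) for score, model in zip(row, MODELS) if score is not None]
    return max(reported)[1]


# Benchmarks where Haiku 4.5 has the top reported score.
haiku_leads = [name for name, row in scores.items() if leader(row) == "Haiku 4.5"]
print(haiku_leads)
```

With the figures above, Haiku 4.5 leads on SWE-bench Verified and Claw Bench, and trails on the other seven benchmarks.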
Standard API Pricing: Haiku 4.5 vs. Peer Models
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. Unit: USD / 1M tokens.
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Haiku 4.5 (Current model) | -- | $1 / 1M tokens | $5 / 1M tokens | -- |
| GPT-5.4 mini | OpenAI | $0.75 / 1M tokens | $4.5 / 1M tokens | -- |
| Gemini 3.0 Flash | -- | $0.5 / 1M tokens | $3 / 1M tokens | -- |
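To make the per-million-token prices concrete, here is a minimal sketch that estimates the cost of a single workload under each model's standard rates. The prices are taken from the table above; the workload size (200,000 input tokens and 50,000 output tokens) and the names `PRICES` and `request_cost` are assumptions chosen for illustration. It does not account for extended-context pricing, caching, or batch discounts.

```python
# USD per 1M tokens, copied from the pricing table: (input, output).
PRICES = {
    "Haiku 4.5": (1.00, 5.00),
    "GPT-5.4 mini": (0.75, 4.50),
    "Gemini 3.0 Flash": (0.50, 3.00),
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate cost in USD at the standard (base) rates for one workload."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
for model in PRICES:
    print(f"{model}: ${request_cost(model, 200_000, 50_000):.3f}")
```

For this hypothetical workload the estimate comes to about $0.45 for Haiku 4.5, $0.375 for GPT-5.4 mini, and $0.25 for Gemini 3.0 Flash.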
Version History
How each version of the Haiku 4.5 series stacks up on benchmark tests
Benchmark Score Comparison
3 benchmarks with comparable scores
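| Benchmark | Haiku 4.5 (This model) | Claude 3.5 Haiku |
|---|---|---|
| GPQA Diamond (comprehensive evaluation) | 73.30, extended (no tools) | 41.60, normal |
| MMLU Pro (comprehensive evaluation) | 80.00, extended (no tools) | 65.00, normal |
| FrontierMath (mathematical reasoning) | 4.10, standard mode (no tools) | 0.30, normal |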
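The generation-over-generation change can be read off as simple score differences. The sketch below computes them from the three rows above; `pairs` and `gain` are illustrative names. Note that the Haiku 4.5 figures for GPQA Diamond and MMLU Pro use an extended mode while the Claude 3.5 Haiku figures use a normal mode, so the differences mix model and mode effects.

```python
# (Haiku 4.5 score, Claude 3.5 Haiku score), copied from the version history table.
pairs = {
    "GPQA Diamond": (73.30, 41.60),
    "MMLU Pro": (80.00, 65.00),
    "FrontierMath": (4.10, 0.30),
}


def gain(new, old):
    """Absolute score difference in points, rounded to two decimals."""
    return round(new - old, 2)


for name, (new, old) in pairs.items():
    print(f"{name}: +{gain(new, old):.2f} points")
```

On these numbers the gains are 31.70 points on GPQA Diamond, 15.00 on MMLU Pro, and 3.80 on FrontierMath.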
Standard API Pricing Across the Haiku 4.5 Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.
These models use different currencies or billing units, so the page falls back to raw price values instead of a shared bar chart.
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Haiku 4.5 (Current model) | -- | $1 / 1M tokens | $5 / 1M tokens | -- |
Series Overview
See how each version of the Haiku 4.5 series performs across major benchmarks, with scores broken down by reasoning mode.
| Benchmark | Claude 3.5 Haiku (10/22/2024) | Haiku 4.5 (10/15/2025) |
|---|---|---|
Single-Benchmark Mode Relation
Viewing: GPQA Diamond · Comprehensive evaluation