Benchmark Results
Comprehensive Evaluation: 3 evaluations
AI Agent - Tool Use: 3 evaluations
Competitor Comparison
Benchmark scores for GPT-5.4 mini compared against top models in its class
6 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | GPT-5.4 mini (Current) | Haiku 4.5 | Gemini 3.0 Flash |
|---|---|---|---|
| GPQA Diamond (Comprehensive Evaluation) | 88.00 (Thinking Level: Extra High) | 73.30 (Extended Thinking) | 90.40 (Thinking Enabled) |
| HLE (Comprehensive Evaluation) | 41.50 (Thinking Level: Extra High, Tools) | 9.70 (Extended Thinking) | 43.50 (Thinking Enabled, Tools) |
| | 2.10 (Thinking Level: High) | 2.10 (32K) | 4.20 (Standard Mode) |
| SWE-Bench Pro - Public (Coding and Software Engineering) | 54.40 (Thinking Level: Extra High, Tools) | 39.45 (Extended Thinking, Tools) | -- |
| Terminal Bench 2.0 (AI Agent - Tool Use) | 60.00 (Thinking Level: Extra High, Tools) | -- | 47.60 (Thinking Enabled, Tools) |
| Claw Bench (OpenClaw Agent Capability Comprehensive Evaluation) | 75.30 (Thinking Enabled, Tools) | 89.40 (Thinking Enabled, Tools) | 85.70 (Thinking Enabled, Tools) |
Standard API Pricing: GPT-5.4 mini vs. Peer Models
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| GPT-5.4 mini | OpenAI | $0.75 / 1M tokens | $4.5 / 1M tokens | -- |
| Haiku 4.5 | -- | $1 / 1M tokens | $5 / 1M tokens | -- |
| Gemini 3.0 Flash | -- | $0.5 / 1M tokens | $3 / 1M tokens | -- |
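To make the per-1M-token rates easier to compare, the sketch below estimates the cost of a single workload at each model's standard price. The rates are copied from the pricing table. The workload size (2M input tokens, 500K output tokens) and the `workload_cost` helper are hypothetical, chosen only for illustration. The sketch ignores any extended-context pricing or discounts.

```python
# Standard prices in USD per 1M tokens, copied from the pricing table.
PRICES = {
    "GPT-5.4 mini": {"input": 0.75, "output": 4.50},
    "Haiku 4.5": {"input": 1.00, "output": 5.00},
    "Gemini 3.0 Flash": {"input": 0.50, "output": 3.00},
}


def workload_cost(model, input_tokens, output_tokens):
    """Return the USD cost of a workload at the model's standard rates."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for model in PRICES:
    print(f"{model}: ${workload_cost(model, 2_000_000, 500_000):.2f}")
```

For this assumed workload the estimate comes to $3.75 for GPT-5.4 mini, $4.50 for Haiku 4.5, and $2.50 for Gemini 3.0 Flash. A different input-to-output ratio would shift the gaps, since output tokens cost several times more than input tokens for all three models.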
Version History
How each version of the GPT-5.4 mini series stacks up on benchmark tests
3 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | GPT-5.4 mini (Current) | GPT-5-mini |
|---|---|---|
| GPQA Diamond (Comprehensive Evaluation) | 88.00 (Thinking Level: Extra High) | 69.00 (Thinking Enabled) |
| HLE (Comprehensive Evaluation) | 41.50 (Thinking Level: Extra High, Tools) | 5.00 (Thinking Enabled) |
| | 2.10 (Thinking Level: High) | 6.30 (Thinking Level: High) |
Single-Benchmark Version Trend
Viewing: GPQA Diamond (Comprehensive Evaluation)
Standard API Pricing Across the GPT-5.4 mini Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| GPT-5.4 mini | OpenAI | $0.75 / 1M tokens | $4.5 / 1M tokens | -- |
| GPT-5-mini | -- | $0.25 / 1M tokens | $2 / 1M tokens | -- |