See key specs and per-benchmark scores for each model/mode. Scroll horizontally for all columns. This page compares benchmark results and core specs for 2 models.

GPT-5.4 Pro
OpenAI
Best overall: GPT-5.4 Pro · 73.78
Best single: GPT-5.4 Pro · ARC-AGI 94.50
Modality coverage: GPT-5.4 Pro · 2 modalities
Head to head
Benchmarks: 5
Wins: 5
Losses: 0
Average diff: +26.94
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
5 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | GPT-5.4 Pro | GPT-5-Pro |
|---|---|---|
| ARC-AGI (comprehensive evaluation) | 94.50 (Thinking Level · High) | 70.20 (Thinking Enabled) |
| ARC-AGI-2 (comprehensive evaluation) | 83.30 (Thinking Level · High) | 18.00 (Thinking Enabled) |
| GPQA Diamond (comprehensive evaluation) | 94.40 (Thinking Level · High) | 89.40 (Thinking Enabled, Tools) |
| HLE (comprehensive evaluation) | 58.70 (Thinking Level · High, Tools) | 42.00 (Thinking Enabled, Tools) |
| | 38.00 (Thinking Level · High) | 14.60 (Thinking Enabled) |
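
The head-to-head figures near the top of the page (5 benchmarks, 5 wins, 0 losses, +26.94 average diff) and the 73.78 "Best overall" value can be reproduced from the table above with plain arithmetic. The sketch below is one way to do that, assuming the average diff is the unweighted mean of per-benchmark differences and the overall score is the unweighted mean of GPT-5.4 Pro's scores; the page does not state its formulas, and the `head_to_head` helper and the placeholder key for the unnamed fifth row are illustrative names, not part of the source.

```python
# Best score per benchmark as (GPT-5.4 Pro, GPT-5-Pro), copied from the table above.
# The fifth row has no benchmark name in the source, so its key is a placeholder.
SCORES = {
    "ARC-AGI": (94.50, 70.20),
    "ARC-AGI-2": (83.30, 18.00),
    "GPQA Diamond": (94.40, 89.40),
    "HLE": (58.70, 42.00),
    "(unnamed row)": (38.00, 14.60),
}


def head_to_head(scores):
    """Summarize the first model against the second over shared benchmarks."""
    # Positive difference means the first model scored higher on that benchmark.
    diffs = [first - second for first, second in scores.values()]
    return {
        "benchmarks": len(diffs),
        "wins": sum(d > 0 for d in diffs),
        "losses": sum(d < 0 for d in diffs),
        # Assumed formula: unweighted mean of the per-benchmark differences.
        "average_diff": round(sum(diffs) / len(diffs), 2),
        # Assumed formula: unweighted mean of the first model's scores.
        "first_model_mean": round(
            sum(first for first, _ in scores.values()) / len(scores), 2
        ),
    }


summary = head_to_head(SCORES)
print(summary)
```

Under these assumptions the result matches the summary figures shown earlier. Note that the comparison takes each model's best listed score, so some rows mix tool-assisted and tool-free runs.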
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Group | Features & specs | GPT-5.4 Pro (OpenAI) | GPT-5-Pro (OpenAI) |
|---|---|---|---|
| Core specs | Release | 2026-03-05 | 2025-08-07 |
| | Context length (tokens) | 1M | 400K |
| | Max output (tokens) | 128000 | 128000 |
| | MoE | No | No |
| | Supported modes | No mode data | Non-Thinking Mode, Thinking Mode |
| License | Code Open Source | Not provided | Not provided |
| | Weights Open Source | Not provided | Not provided |
| | Commercial use | Not open source | Not open source |
| Modality support | Text Input/Output | / | / |
| | Image Input/Output | / | / |
| Resources | Paper / report | Introducing GPT‑5.4 | Introducing GPT-5 |
| | DataLearner blog | Not provided | OpenAI releases GPT-5: an AI system with real-time routing, not just a single model |

GPT-5-Pro
OpenAI