GPT-5.4's current benchmark results are led by HLE (rank 3 of 113, score 52.10), GPQA Diamond (rank 6 of 160, score 92.80), and SWE-Bench Pro - Public (rank 1 of 19, score 57.70). This page also compares it with 2 competitor models and 2 predecessor or same-series models, with performance and pricing views where available. 1 source link is attached for reference.
Side-by-side benchmark comparison of GPT-5.4 against leading peer models
The chart uses a horizontal layout (applied automatically for dense data).
Standard text input and output prices are shown side by side for each model. If extended-context pricing exists, the chart keeps the base rate and notes the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.
Default supplier standard price · Billing unit: USD / 1M tokens · Current model: bold label
Where a context threshold applies, the charted base price holds only within that limit.
Track the evolution of the GPT-5.4 series across generations
Top: a multi-benchmark panorama. Bottom: a single-benchmark view, with dotted lines linking modes within each generation.
Tip: click any score cell to switch the chart below.
Note: X-axis is model (with release date), Y-axis is score. Dotted lines connect modes within the same generation.