GPT-5.4

Release date: 2026-03-05
Updated: 2026-06-15 07:18:17.137
Knowledge cutoff: 2025-08
Parameters
Not disclosed
Context length
1M
Chinese support
Supported
Reasoning ability

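GPT-5.4 is a multimodal large language model released by OpenAI in March 2026 as an iteration in the GPT-5 series. It targets complex knowledge work, software engineering assistance, and long-context analysis, supports a context window of up to 1M tokens, and is offered in several configuration variants, including Thinking and Pro. On major benchmarks, GPT-5.4 is competitive on SWE-Bench Pro (57.70, ranked 1st), GPQA Diamond (92.80), OSWorld-Verified (75.0, ranked 1st), and FrontierMath (47.60). Standard API pricing is $2.50 per 1M input tokens (for contexts up to 272K) and $15.00 per 1M output tokens. The model is available through the OpenAI API and ChatGPT; its weights are not open source.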

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces
Supported
Thinking modes
Thinking Level: Extra-High (default), Low, Medium, High
Context length
1M tokens
Max output length
125K tokens
Model type
Multimodal model
Modality (in / out)
Text, Image → Text
Release date
2026-03-05
Model file size
No data
MoE architecture
No
Total params / Active params
No data / N/A
Knowledge cutoff
2025-08
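
The context and output limits listed above can be turned into a quick pre-flight check. The following is a minimal sketch, not an official client: the function name `fits_limits` is hypothetical, "1M" and "125K" are assumed to mean 1,000,000 and 125,000 tokens, and the page does not say whether output tokens count against the context window, so that behavior is left as a switch.

```python
# Limits copied from the "Model basics" values on this page.
CONTEXT_LIMIT = 1_000_000  # "1M tokens"; assumes decimal millions
MAX_OUTPUT = 125_000       # "125K tokens"; assumes decimal thousands


def fits_limits(input_tokens: int, max_output_tokens: int,
                shared_window: bool = True) -> bool:
    """Return True if a request stays within the published limits.

    Assumption: when shared_window is True, input and requested output
    must fit in the context window together. The source page does not
    state which behavior applies, so both are supported here.
    """
    if input_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if max_output_tokens > MAX_OUTPUT:
        return False
    budget = input_tokens + (max_output_tokens if shared_window else 0)
    return budget <= CONTEXT_LIMIT
```

Under the shared-window assumption, a 900,000-token prompt leaves room for at most 100,000 output tokens, which is below the 125K output cap.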

Open source & experience

Code license
Not open source
Weights license
Not open source
GitHub repo
GitHub link unavailable
Hugging Face
Hugging Face link unavailable

Official resources

DataLearnerAI blog
No blog post yet

API details

API speed
3/5
Note: default unit is $/1M tokens. If vendors use other units, follow their published pricing.
Standard
Type    Condition          Input         Output
Text    Context <= 272K    $2.50 / 1M    $15.00 / 1M
Text    Context > 272K     $5.00 / 1M    $22.50 / 1M
Cache Pricing (Prompt Cache)
Type    TTL    Write          Read
Text    5m     $0.250 / 1M    -
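
The two Standard pricing tiers above lend themselves to a small cost estimate. The sketch below is illustrative only: `estimate_cost` and the `Tier` class are hypothetical names, "272K" is assumed to mean 272,000 tokens, and it assumes the tier is selected by the input token count and then applied to the whole request. The page does not confirm that billing rule, and cache pricing is left out.

```python
from dataclasses import dataclass

# Assumption: "272K" means 272,000 tokens.
LONG_CONTEXT_THRESHOLD = 272_000


@dataclass(frozen=True)
class Tier:
    """USD rates per 1M tokens, copied from the Standard pricing rows."""
    input_per_m: float
    output_per_m: float


SHORT_TIER = Tier(input_per_m=2.50, output_per_m=15.00)  # context <= 272K
LONG_TIER = Tier(input_per_m=5.00, output_per_m=22.50)   # context > 272K


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one text request.

    Assumption: the tier is chosen by the input token count and its rates
    apply to every token in the request, not only to the overage.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = SHORT_TIER if input_tokens <= LONG_CONTEXT_THRESHOLD else LONG_TIER
    total = input_tokens * tier.input_per_m + output_tokens * tier.output_per_m
    return total / 1_000_000
```

Under these assumptions, a request with 100,000 input tokens and 10,000 output tokens comes to about $0.40, while 500,000 input tokens with the same output comes to about $2.73.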

Benchmark Results

Leading benchmark results for GPT-5.4 currently include LiveBench (rank 2 of 115, score 80.28), Pinch Bench (rank 1 of 37, score 90.50), and GPQA Diamond (rank 10 of 179, score 92.80). This page also lists core specs, context limits, and API pricing, so the model can be evaluated on benchmark results and deployment constraints together.
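
Because each leaderboard on this page covers a different number of models, raw ranks are hard to compare directly. One possible normalization, shown here as a sketch with hypothetical helper names (`parse_rank`, `top_fraction`), is to divide the rank by the leaderboard size. It ignores the fact that different leaderboards include different sets of models.

```python
def parse_rank(cell: str) -> tuple[int, int]:
    """Split a 'rank / total' cell such as '10 / 179' into two integers."""
    rank_text, total_text = cell.split("/")
    rank, total = int(rank_text), int(total_text)
    if not 1 <= rank <= total:
        raise ValueError(f"rank out of range: {cell!r}")
    return rank, total


def top_fraction(cell: str) -> float:
    """Return rank divided by leaderboard size (smaller is better)."""
    rank, total = parse_rank(cell)
    return rank / total


if __name__ == "__main__":
    # Rank cells quoted in the summary sentence on this page.
    for name, cell in [("LiveBench", "2 / 115"),
                       ("Pinch Bench", "1 / 37"),
                       ("GPQA Diamond", "10 / 179")]:
        print(f"{name}: top {top_fraction(cell):.1%}")
```

By this measure, rank 2 of 115 is a smaller fraction than rank 1 of 37, so a first-place finish on a short leaderboard is not automatically the stronger relative placement.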

General Knowledge

14 evaluations
Benchmark       Mode                 Score    Rank/total
ARC-AGI         Standard Mode        93.70    7 / 65
-               -                    68.20    28 / 65
ARC-AGI         Medium               86.20    18 / 65
ARC-AGI         Extra-High           93.70    7 / 65
GPQA Diamond    Extra-High           92.80    10 / 179
-               -                    75.07    16 / 115
LiveBench       -                    80.28    2 / 115
ARC-AGI-2       Standard Mode        77.10    7 / 59
-               -                    29.20    30 / 59
ARC-AGI-2       Medium               55.40    19 / 59
ARC-AGI-2       Extra-High           74       10 / 59
HLE             Extra-High           39.80    54 / 159
HLE             Extra-High, Tools    52.10    15 / 159
-               -                    0        4 / 6

Math and Reasoning

2 evaluations
Benchmark       Mode          Score    Rank/total
FrontierMath    Extra-High    47.60    5 / 60
-               -             27.10    11 / 80

Coding and Software Engineering

2 evaluations
Benchmark        Mode                 Score    Rank/total
SWE-Bench Pro    -                    57.70    11 / 44
DeepSWE          Extra-High, Tools    52       4 / 9

Agent Level Benchmark

2 evaluations
Benchmark             Mode                    Score    Rank/total
τ²-Bench - Telecom    Standard Mode, Tools    64.30    30 / 35
τ²-Bench - Telecom    Extra-High, Tools       98.90    3 / 35

AI Agent - Information Search

1 evaluation
Benchmark     Mode                 Score    Rank/total
BrowseComp    Extra-High, Tools    82.70    11 / 45

AI Agent - Tool Usage

3 evaluations
Benchmark             Mode                 Score    Rank/total
Terminal Bench 2.0    Extra-High, Tools    75.10    4 / 46
OSWorld-Verified      Extra-High, Tools    75       7 / 18
MCP-Atlas             Extra-High, Tools    70.60    10 / 23

Claw-style Agent Evaluation

2 evaluations
Benchmark      Mode                    Score    Rank/total
Claw Bench     Thinking Mode, Tools    92.70    3 / 29
Pinch Bench    Thinking Mode, Tools    90.50    1 / 37
