OpenAI o4 - mini

Name: OpenAI o4 - mini
Availability: InStock
Author: OpenAI

推理大模型

OpenAI o4 - mini

Release date: 2025-04-16更新于: 2025-04-19 10:57:281,263

Live demoGitHubHugging FaceCompare

Parameters

Not disclosed

Context length

200K

Chinese support

Supported

Reasoning ability

OpenAI o4 - mini is an AI model published by OpenAI, released on 2025-04-16, for 推理大模型, and 200K tokens context length, under the 不开源 license.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators. Learn about our data methodology

OpenAI o4 - mini

Model basics

Reasoning traces

Supported

Thinking modes

Thinking modes not supported

Context length

200K tokens

Max output length

No data

Model type

推理大模型

Release date

2025-04-16

Model file size

No data

MoE architecture

Total params / Active params

No data / N/A

Knowledge cutoff

No data

OpenAI o4 - mini

Open source & experience

Code license

不开源

Weights license

不开源- 不开源

GitHub repo

GitHub link unavailable

Hugging Face

Hugging Face link unavailable

Live demo

No live demo

OpenAI o4 - mini

Official resources

Paper

Introducing OpenAI o3 and o4-mini

DataLearnerAI blog

No blog post yet

OpenAI o4 - mini

API details

API speed

3/5

💡Default unit: $/1M tokens. If vendors use other units, follow their published pricing.

Standard pricingStandard

Modality	Input	Output
Text	$1.1	$4.4
Image	$1.1	--

OpenAI o4 - mini

Benchmark Results

OpenAI o4 - mini currently shows benchmark results led by AIME 2024 (1 / 62, score 98.70), MMLU (2 / 65, score 93), AIME2025 (10 / 106, score 99.50). This page also consolidates core specs, context limits, and API pricing so you can evaluate the model from benchmark results and deployment constraints together.

综合评估

6 evaluations

Benchmark / mode

Score

Rank/total

MMLU

Thinking Mode

2 / 65

GPQA Diamond

Thinking Mode

81.40

63 / 175

MMLU Pro

Thinking Mode

80.60

53 / 124

ARC-AGI

Thinking Mode

58.70

27 / 56

HLE

Thinking Mode

14.28

114 / 148

HLE

Thinking ModeTools

17.70

102 / 148

编程与软件工程

2 evaluations

Benchmark / mode

Score

Rank/total

CodeForces

Thinking ModeTools

2719

6 / 16

SWE-bench Verified

Thinking Mode

68.10

59 / 103

数学推理

12 evaluations

Benchmark / mode

Score

Rank/total

AIME2025

Thinking Mode

92.70

32 / 106

AIME2025

Thinking ModeTools

99.50

10 / 106

AIME 2024

Thinking Mode

93.40

5 / 62

AIME 2024

Thinking ModeTools

98.70

1 / 62

FrontierMath

Low

9.70

26 / 57

FrontierMath

Medium

19.30

15 / 57

FrontierMath

High

17.20

18 / 57

IMO-ProofBench

High

11.40

12 / 16

IMO 2024

Thinking Mode

7.70

7 / 10

FrontierMath - Tier 4

Medium

2.10

24 / 38

FrontierMath - Tier 4

High

6.30

15 / 38

IMO 2025

Thinking Mode

7 / 9

常识推理

1 evaluations

Benchmark / mode

Score

Rank/total

Simple Bench

Thinking Mode

38.70

19 / 27

Agent能力评测

3 evaluations

Benchmark / mode

Score

Rank/total

Aider-Polyglot

High

8 / 26

τ²-Bench

Thinking ModeTools

56.90

30 / 40

τ²-Bench - Telecom

Thinking ModeTools

50.20

33 / 35

View benchmark analysis Compare with other models

OpenAI o4 - mini

Publisher

OpenAI

View publisher details

OpenAI o4 - mini

Model Overview

o4 mini是OpenAI最新发布的推理大模型。

OpenAI o4-mini 是一款专注于快速、经济高效推理的小型化模型。尽管其规模较小，但它在数学、编码和视觉任务等领域展现出显著的性能。

该模型具备强大的推理能力，并能够有效地利用和组合ChatGPT内的各种工具，包括网络搜索、使用Python分析上传文件和数据、对视觉输入进行深度推理，甚至生成图像。o4-mini经过训练，能够判断何时以及如何使用这些工具来解决复杂问题，并生成详细且经过深思熟虑的答案。

在性能方面，o4-mini在多个基准测试中表现出色。例如，在AIME 2024和2025数学竞赛中，o4-mini是表现最佳的基准模型。当配合Python解释器使用时，o4-mini在AIME 2025上实现了99.5%的pass@1（8个一致性答案下达到100%的共识）。这体现了其有效利用工具的能力。专家评估指出，o4-mini不仅在数学、编码和视觉任务上表现优异，在非STEM任务以及数据科学等领域也超越了其前代模型o3-mini。同时，与前代推理模型相比，o4-mini在指令遵循、提供更有用和可验证的回复方面均有提升，交互时也表现得更为自然和对话化，能够更好地利用记忆和过往对话内容使回复更具个性化和相关性。

o4-mini在效率和成本方面也具有优势。由于其高效性，o4-mini支持比o3更高的使用限制，使其成为处理需要推理能力的高容量、高吞吐量任务的有力选择。在成本效益方面，o4-mini相较于o3-mini实现了提升，预计在多数实际应用场景中，o4-mini将比o3-mini更智能且更经济。

在安全性方面，OpenAI为o3和o4-mini重建了安全训练数据，增加了在生物风险、恶意软件生成和越狱等领域的拒绝提示。这使得o4-mini在内部拒绝基准测试中表现出色。同时，OpenAI还开发了系统级缓解措施来标记高风险领域的危险提示。根据评估结果，o4-mini在生物与化学、网络安全和AI自我改进三个追踪能力领域均低于“高”风险阈值。

用户可以通过多种途径访问o4-mini。ChatGPT Plus、Pro和Team用户可以在模型选择器中找到o4-mini和o4-mini-high，它们将替代此前的o3-mini和o3-mini-high。免费用户可以在提交查询前选择“Think”来体验o4-mini。开发者也可以通过Chat Completions API和Responses API使用o4-mini。

DataLearner 官方微信

欢迎关注 DataLearner 官方微信，获得最新 AI 技术推送

Modality

Input

Output

Text

$1.1

$4.4

Image

$1.1