
Comparing Grok 4 Fast, Grok 4, GPT-5 (+1 more) - LLM benchmark results | DataLearnerAI

LLM benchmark comparison results

See key specs and per-benchmark scores for each model and mode. This page currently compares the benchmark data and core specifications of 4 models.

Grok 4 Fast, Grok 4, GPT-5, Gemini 2.5-Pro

Spec comparison

xAI

Grok 4 Fast

Release: 2025-09-19

Context length: 2000K

Parameters: not provided

Supported modes: Non-Thinking Mode, Thinking Mode

xAI

Grok 4

Release: 2025-07-10

Context length: 256K

Parameters: not provided

Supported modes: Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode

OpenAI

GPT-5

Release: 2025-08-07

Context length: 400K

Parameters: not provided

Supported modes: Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode

Google DeepMind

Gemini 2.5-Pro

Release: 2025-06-05

Context length: 1000K

Parameters: not provided

Supported modes: Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Best overall: Grok 4 · 75.85

Best single score: GPT-5 · AIME2025 99.60

Thinking Mode (Default)

GPT-5 · 4 All Modes

Benchmark scores

Higher is usually better; “—” means no score.

Benchmark score table

Complete scores for each model/mode across selected benchmarks.

6 benchmarks · 14 modes

Supported modes: Normal, Think, Deep, Tool, Parallel

Benchmark	Grok 4 Fast (xAI)			Grok 4 (xAI)				GPT-5 (OpenAI)					Gemini 2.5-Pro (Google DeepMind)
Comprehensive evaluation
GPQA Diamond	—	85.70	—	—	87.00	—	—	77.80	85.70	—	—	87.30	—	86.40
HLE	—	20.00	—	—	25.40	38.60	38.60	6.30	—	—	24.80	35.20	—	21.60
LiveBench	68.09	—	—	72.84	—	—	—	—	79.33	78.85	—	—	—	71.92
General knowledge Q&A
SimpleQA	—	—	95.00	—	—	—	—	—	—	—	—	—	54.00	—
Coding and software engineering
LiveCodeBench	—	80.00	—	—	82.00	—	—	—	—	—	—	—	77.10	—
Mathematical reasoning
AIME2025	—	92.00	—	—	91.70	98.80	—	61.90	—	—	94.60	99.60	—	88.00

Feature comparison

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

Features & specs	Grok 4 Fast (xAI)	Grok 4 (xAI)	GPT-5 (OpenAI)	Gemini 2.5-Pro (Google DeepMind)
Model snapshots
Organization	xAI	xAI	OpenAI	Google DeepMind
Full model name	Grok 4 Fast	Grok 4	GPT-5	Gemini 2.5-Pro
Model description	Not provided	Not provided	Not provided	Not provided
Model type	Chat LLM	Reasoning LLM	Foundation LLM	Reasoning LLM
Model ID	Grok-4-Fast	grok-4	gpt-5	gemini-2_5-pro-preview-06-05
Release	2025-09-19	2025-07-10	2025-08-07	2025-06-05
MoE	No	No	No	No
Specs and performance
Context length	2000K	256K	400K	1000K
Parameters	—	—	—	—
Active parameters	Not provided	Not provided	Not provided	Not provided
Model scale	Unknown	Unknown	Unknown	Unknown
Model size	Not provided	Not provided	Not provided	Not provided
Inference speed
Reasoning level
Max output	4096	262144	131072	65536
Supported modes	Non-Thinking Mode, Thinking Mode	Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode	Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode	Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode
Open source and licensing
Code Open Source	Not provided	Not provided	Not provided	Not provided
Weights Open Source	Not provided	Not provided	Not provided	Not provided
Commercial use	Not open source	Not open source	Not open source	Not open source
Modality support
Text Input/Output	/	/	/	/
Image Input/Output	/	/	/	/
Audio Input/Output	/	/	/	/
Video Input/Output	/	/	/	/
Embedding Input/Output	/	/	/	/
API details
Text pricing	Input: $0.2 / 1M tokens; Output: $0.5 / 1M tokens	Input: $3 / 1M tokens; Output: $15 / 1M tokens	Input: $1.25 / 1M tokens; Output: $10 / 1M tokens	Input: $1.25 / 1M tokens; Output: $10 / 1M tokens; Cache: $0.125 / 1M tokens; Input (Extended): $2.5 / 1M tokens; Output (Extended): $15 / 1M tokens; Threshold: 200K
Image API pricing	Input: $0.2 / 1M tokens	Input: $3 / 1M tokens	Not provided	Input: $1.25 / 1M tokens; Cache: $0.125 / 1M tokens
Audio API pricing	Not provided	Not provided	Not provided	Not provided
Video API pricing	Not provided	Not provided	Not provided	Not provided
Embedding API pricing	Not provided	Not provided	Not provided	Not provided
Resources
GitHub	Not provided	Not provided	Not provided	Not provided
Hugging Face	Not provided	Not provided	Not provided	Not provided
Official Page	Not provided	Not provided	Not provided	Not provided
Guides	Not provided	Not provided	Not provided	Not provided
Papers	Grok 4 Fast Pushing the Frontier of Cost-Efficient Intelligence	Grok 4	Introducing GPT-5	Try the latest Gemini 2.5 Pro before general availability.
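DataLearnerAI	The perfect combination of LLM speed, quality, and price? xAI releases Grok 4 Fast: performance close to Grok 4, cost down 98%, generation speed doubled!	Perfect score on AIME 2025: xAI officially releases the Grok model, with Grok 4 Heavy outscoring all current LLMs in evaluations and a perfect score on the American math competition! $3,000 per year subscription fee!	OpenAI releases GPT-5: an AI system with real-time routing, not just a single model	Google releases Gemini 2.5 Pro: the first 2.5-version model in the Gemini series, supporting up to 2 million tokens of context, all-modality input, a reasoning LLM, ranked first on LMArena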

API pricing

API price comparison

Side-by-side input/output token pricing
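
To make the per-million-token text prices listed in the feature table easier to compare, the following Python sketch estimates the cost of a single request. The rates are copied from the Text pricing row; the names `TextPricing`, `PRICING`, and `request_cost` are hypothetical helpers invented for this illustration. One loud assumption: the page lists a 200K threshold and "Extended" rates for Gemini 2.5-Pro without saying how they apply, so the sketch assumes the extended rates cover the whole request once the input exceeds the threshold. Cache pricing is ignored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextPricing:
    """Text prices in USD per 1M tokens, as listed on the comparison page."""
    input_usd_per_m: float
    output_usd_per_m: float
    # Optional extended tier (the page lists one only for Gemini 2.5-Pro).
    threshold_tokens: Optional[int] = None
    ext_input_usd_per_m: Optional[float] = None
    ext_output_usd_per_m: Optional[float] = None


# Rates copied from the "Text pricing" row of the feature table.
PRICING = {
    "Grok 4 Fast": TextPricing(0.2, 0.5),
    "Grok 4": TextPricing(3.0, 15.0),
    "GPT-5": TextPricing(1.25, 10.0),
    "Gemini 2.5-Pro": TextPricing(1.25, 10.0, 200_000, 2.5, 15.0),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one text request (cache discounts ignored)."""
    p = PRICING[model]
    in_rate, out_rate = p.input_usd_per_m, p.output_usd_per_m
    # ASSUMPTION: extended rates apply to the whole request once the input
    # exceeds the threshold; the page does not spell out the exact rule.
    if p.threshold_tokens is not None and input_tokens > p.threshold_tokens:
        in_rate, out_rate = p.ext_input_usd_per_m, p.ext_output_usd_per_m
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


if __name__ == "__main__":
    # Compare the same workload (100K input, 10K output tokens) across models.
    for name in PRICING:
        print(f"{name}: ${request_cost(name, 100_000, 10_000):.4f}")
```

Under these rates, a request with 1M input and 1M output tokens on Grok 4 comes to $18, which is 3 + 15 from the table. Actual billing may differ, so treat the numbers as rough estimates rather than quotes.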

