See key specs and per-benchmark scores for each model/mode. Scroll horizontally for all columns. Currently comparing benchmark data and core specs for 4 models.
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
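The source order above reads like a precedence rule. A minimal sketch of that idea, assuming each reported score carries a source tag; the tag names and the `pick_score` helper are hypothetical and not part of the site's actual pipeline:

```python
from typing import Optional

# Hypothetical source tags, ordered from most to least preferred,
# mirroring the order stated above.
SOURCE_PRIORITY = ["official", "leaderboard", "third_party"]


def pick_score(reports: dict[str, float]) -> Optional[tuple[str, float]]:
    """Return (source, score) from the most preferred source that reported one."""
    for source in SOURCE_PRIORITY:
        if source in reports:
            return source, reports[source]
    return None  # no source reported a score


# Illustrative numbers only, not taken from any real evaluation.
print(pick_score({"third_party": 70.0, "leaderboard": 71.5}))
```

Under this assumption a leaderboard figure would win over a third-party one, and an official figure would win over both.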
Performance benchmarks
Best Overall
Grok 4 · 75.85
Best Single
GPT-5 · AIME2025 99.60
Thinking Mode (Default)
GPT-5 · 4 All Modes
Higher is usually better; “—” means no score.
Complete scores for each model/mode across selected benchmarks.
| Benchmark | Grok 4 Fast (xAI) | Grok 4 (xAI) | GPT-5 (OpenAI) | Gemini 2.5-Pro (Google DeepMind) | ||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Comprehensive evaluation | ||||||||||||||
GPQA Diamond | — | 85.70 | — | — | 87.00 | — | — | 77.80 | 85.70 | — | — | 87.30 | — | 86.40 |
HLE | — | 20.00 | — | — | 25.40 | 38.60 | 38.60 | 6.30 | — | — | 24.80 | 35.20 | — | 21.60 |
LiveBench | 68.09 | — | — | 72.84 | — | — | — | — | 79.33 | 78.85 | — | — | — | 71.92 |
| General knowledge QA | ||||||||||||||
SimpleQA | — | — | 95.00 | — | — | — | — | — | — | — | — | — | 54.00 | — |
| Coding and software engineering | ||||||||||||||
LiveCodeBench | — | 80.00 | — | — | 82.00 | — | — | — | — | — | — | — | 77.10 | — |
| Mathematical reasoning | ||||||||||||||
AIME2025 | — | 92.00 | — | — | 91.70 | 98.80 | — | 61.90 | — | — | 94.60 | 99.60 | — | 88.00 |
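The rows above can be read programmatically by treating "—" as a missing score. A minimal sketch, using three rows copied from the table; `parse_row` and `best_single` are hypothetical helper names, and taking the raw maximum across benchmarks is only one plausible reading of the "Best Single" summary, not necessarily how the site computes it:

```python
from typing import Optional

# Three rows copied from the table above (benchmark name, then 14 score cells).
ROWS = """\
GPQA Diamond | — | 85.70 | — | — | 87.00 | — | — | 77.80 | 85.70 | — | — | 87.30 | — | 86.40 |
HLE | — | 20.00 | — | — | 25.40 | 38.60 | 38.60 | 6.30 | — | — | 24.80 | 35.20 | — | 21.60 |
AIME2025 | — | 92.00 | — | — | 91.70 | 98.80 | — | 61.90 | — | — | 94.60 | 99.60 | — | 88.00 |
"""


def parse_row(line: str) -> tuple[str, list[Optional[float]]]:
    """Split one row into its benchmark name and scores; "—" becomes None."""
    cells = [c.strip() for c in line.strip().strip("|").split("|")]
    name, raw = cells[0], cells[1:]
    return name, [None if c == "—" else float(c) for c in raw]


def best_single(text: str) -> tuple[str, float]:
    """Return the (benchmark, score) pair with the highest score in the text."""
    best = ("", float("-inf"))
    for line in text.splitlines():
        name, scores = parse_row(line)
        for s in scores:
            if s is not None and s > best[1]:
                best = (name, s)
    return best


print(best_single(ROWS))  # ('AIME2025', 99.6)
```

The result agrees with the 99.60 AIME2025 figure shown in the "Best Single" summary.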
Feature compare
Licensing, MoE architecture, and multi-modality support.
| Features & specs | Grok 4 Fast (xAI) | Grok 4 (xAI) | GPT-5 (OpenAI) | Gemini 2.5-Pro (Google DeepMind) |
|---|---|---|---|---|
Model snapshots | ||||
Organization | xAI | xAI | OpenAI | Google DeepMind |
Full model name | Grok 4 Fast | Grok 4 | GPT-5 | Gemini 2.5-Pro |
Model description | Not provided | Not provided | Not provided | Not provided |
Model type | Chat LLM | Reasoning LLM | Foundation LLM | Reasoning LLM |
Model identifier | Grok-4-Fast | grok-4 | gpt-5 | gemini-2_5-pro-preview-06-05 |
Release | 2025-09-19 | 2025-07-10 | 2025-08-07 | 2025-06-05 |
MoE | No | No | No | No |
Specs and performance | ||||
Context length | 2000K | 256K | 400K | 1000K |
Parameters | — | — | — | — |
Active parameters | Not provided | Not provided | Not provided | Not provided |
Model scale | Unknown | Unknown | Unknown | Unknown |
Model size | Not provided | Not provided | Not provided | Not provided |
Inference speed | ||||
Reasoning level | ||||
Max output | 4096 | 262144 | 131072 | 65536 |
Supported modes | Non-Thinking Mode, Thinking Mode | Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode | Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode | Non-Thinking Mode, Thinking Mode, Deeper Thinking Mode |
Open source and licensing | ||||
Code Open Source | Not provided | Not provided | Not provided | Not provided |
Weights Open Source | Not provided | Not provided | Not provided | Not provided |
Commercial use | Not open source | Not open source | Not open source | Not open source |
Modality support | ||||
Text Input/Output | / | / | / | / |
Image Input/Output | / | / | / | / |
Audio Input/Output | / | / | / | / |
Video Input/Output | / | / | / | / |
Embedding Input/Output | / | / | / | / |
API details | ||||
Text API pricing | Input: $0.2 / 1M tokens; Output: $0.5 / 1M tokens | Input: $3 / 1M tokens; Output: $15 / 1M tokens | Input: $1.25 / 1M tokens; Output: $10 / 1M tokens | Input: $1.25 / 1M tokens; Output: $10 / 1M tokens; Cache: $0.125 / 1M tokens; Input (Extended): $2.5 / 1M tokens; Output (Extended): $15 / 1M tokens; Threshold: 200K |
Image API pricing | Input: $0.2 / 1M tokens | Input: $3 / 1M tokens | Not provided | Input: $1.25 / 1M tokens; Cache: $0.125 / 1M tokens |
Audio API pricing | Not provided | Not provided | Not provided | Not provided |
Video API pricing | Not provided | Not provided | Not provided | Not provided |
Embedding API pricing | Not provided | Not provided | Not provided | Not provided |
Resources | ||||
GitHub | Not provided | Not provided | Not provided | Not provided |
Hugging Face | Not provided | Not provided | Not provided | Not provided |
Official Page | Not provided | Not provided | Not provided | Not provided |
Guides | Not provided | Not provided | Not provided | Not provided |
Papers | Grok 4 Fast Pushing the Frontier of Cost-Efficient Intelligence | Grok 4 | Introducing GPT-5 | Try the latest Gemini 2.5 Pro before general availability. |
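DataLearnerAI | The perfect combination of LLM speed, quality, and price? xAI releases Grok 4 Fast: performance close to Grok 4, cost down 98%, generation speed doubled! | Perfect score on AIME 2025: xAI officially releases the Grok models, with Grok 4 Heavy outscoring all current LLMs in evaluations and earning a perfect score on the American math competition! Subscription costs $3,000 a year! | OpenAI releases GPT-5: an AI system with real-time routing, not just a model | Google releases Gemini 2.5 Pro: the first 2.5-version model in the Gemini series, supporting up to a 2M context, all-modality input, a reasoning LLM, ranked first on LMArena |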
API pricing
Side-by-side input/output token pricing
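To make the per-million-token prices concrete, here is a rough sketch that estimates the cost of a single text request from the text pricing row in the feature table above. The `TextPricing` class and `request_cost` function are hypothetical names. How the 200K threshold is applied for Gemini 2.5-Pro is an assumption (extended rates for the whole request once the input exceeds it), and cache pricing is ignored:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TextPricing:
    """USD per 1M tokens, copied from the text pricing row of the feature table."""
    input_usd: float
    output_usd: float
    # Optional long-context tier (only listed for Gemini 2.5-Pro).
    ext_input_usd: Optional[float] = None
    ext_output_usd: Optional[float] = None
    ext_threshold_tokens: Optional[int] = None


PRICING = {
    "Grok 4 Fast": TextPricing(0.2, 0.5),
    "Grok 4": TextPricing(3.0, 15.0),
    "GPT-5": TextPricing(1.25, 10.0),
    "Gemini 2.5-Pro": TextPricing(1.25, 10.0, 2.5, 15.0, 200_000),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one text request, ignoring cache discounts."""
    p = PRICING[model]
    in_rate, out_rate = p.input_usd, p.output_usd
    # Assumption: the extended rates apply to the whole request once the
    # input exceeds the threshold. The table does not spell this out.
    if p.ext_threshold_tokens is not None and input_tokens > p.ext_threshold_tokens:
        in_rate, out_rate = p.ext_input_usd, p.ext_output_usd
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


for name in PRICING:
    print(f"{name}: ${request_cost(name, 100_000, 10_000):.4f}")
```

For a request with 100K input tokens and 10K output tokens, this works out to about $0.025 for Grok 4 Fast, $0.45 for Grok 4, and $0.225 for both GPT-5 and Gemini 2.5-Pro at the listed rates.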