GPT-5 vs GPT-4o(2025-03-27)
Across 3 shared benchmarks, GPT-5 leads overall: GPT-5 wins 2, GPT-4o(2025-03-27) wins 1, with 0 ties and an average score difference of +14.43.
GPT-5
OpenAI · 2025-08-07 · Foundation model
GPT-4o(2025-03-27)
OpenAI · 2025-03-27 · AI model
GPT-5: 2 wins (67%) · GPT-4o(2025-03-27): 1 win (33%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 3 shared benchmarks.
General Knowledge
Even 2/2

| Benchmark | GPT-5 | GPT-4o(2025-03-27) | Diff |
|---|---|---|---|
| GPQA Diamond | 77.80 (81 / 175) | 66.90 (121 / 175) | +10.90 |
| ARC-AGI | 6 (61 / 65) | 8.80 (60 / 65) | -2.80 |
Math and Reasoning
GPT-5 1/1

| Benchmark | GPT-5 | GPT-4o(2025-03-27) | Diff |
|---|---|---|---|
| AIME2025 | 61.90 | 26.70 | +35.20 |
Specs
| Field | GPT-5 | GPT-4o(2025-03-27) |
|---|---|---|
| Publisher | OpenAI | OpenAI |
| Release date | 2025-08-07 | 2025-03-27 |
| Model type | Foundation model | AI model |
| Architecture | Dense | Dense |
| Parameters | Not disclosed | Not disclosed |
| Context length (tokens) | 400K | 128K |
| Max output (tokens) | 131072 | 4096 |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | GPT-5 | GPT-4o(2025-03-27) |
|---|---|---|
| Text input | $1.25 per 1M tokens | $2.50 per 1M tokens |
| Text output | $10 per 1M tokens | $10 per 1M tokens |
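
To see what the per-million-token prices above mean for a concrete request volume, the following is a minimal sketch. The function name `estimate_cost` and the workload (200,000 input tokens, 50,000 output tokens) are hypothetical, chosen only for illustration; the prices are the ones listed in the table.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing table on this page.
PRICES_USD_PER_MILLION = {
    "GPT-5": {"input": Decimal("1.25"), "output": Decimal("10")},
    "GPT-4o(2025-03-27)": {"input": Decimal("2.5"), "output": Decimal("10")},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated text cost in USD for one workload (hypothetical helper)."""
    price = PRICES_USD_PER_MILLION[model]
    total = price["input"] * input_tokens + price["output"] * output_tokens
    return total / Decimal(1_000_000)


# Hypothetical workload: 200,000 input tokens and 50,000 output tokens.
for model in PRICES_USD_PER_MILLION:
    print(model, estimate_cost(model, 200_000, 50_000))
```

Under these assumptions the two models differ only on the input side, since the listed output price is the same for both.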
Summary
- GPT-5 leads in: Math and Reasoning (1/1)
- Tied in: General Knowledge
On average across the 3 shared benchmarks, GPT-5 scores 14.43 higher.
Largest single-benchmark gap: AIME2025 — GPT-5 61.90 vs GPT-4o(2025-03-27) 26.70 (+35.20).
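
The summary figures can be checked from the three benchmark scores reported on this page. The sketch below is one way to do that, not the page's actual generation logic. It assumes the GPT-5 ARC-AGI score is 6.00, which is inferred from the reported GPT-4o score (8.80) and difference (-2.80); the variable names are illustrative.

```python
# Scores as reported on this page: (GPT-5, GPT-4o(2025-03-27)).
# Assumption: GPT-5's ARC-AGI score is 6.00 (8.80 - 2.80).
scores = {
    "GPQA Diamond": (77.80, 66.90),
    "ARC-AGI": (6.00, 8.80),
    "AIME2025": (61.90, 26.70),
}

# Per-benchmark difference, GPT-5 minus GPT-4o, rounded to two decimals.
diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}

gpt5_wins = sum(d > 0 for d in diffs.values())
gpt4o_wins = sum(d < 0 for d in diffs.values())
ties = sum(d == 0 for d in diffs.values())

# Mean of the signed differences across all shared benchmarks.
average_diff = round(sum(diffs.values()) / len(diffs), 2)

# Benchmark with the largest absolute gap.
largest_gap = max(diffs, key=lambda name: abs(diffs[name]))

print(gpt5_wins, gpt4o_wins, ties, average_diff, largest_gap)
```

The average is a mean of signed differences, so the ARC-AGI result, where GPT-4o(2025-03-27) is ahead, pulls it down rather than being ignored.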
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.