GPT-4o (2024-11-20) vs GPT-4o
GPT-4o (2024-11-20) and GPT-4o are tied across 7 shared benchmarks: GPT-4o (2024-11-20) leads on 2, GPT-4o leads on 2, with 3 ties and an average score difference of -1.37.
GPT-4o (2024-11-20)
OpenAI · 2024-11-20 · AI model
GPT-4o
OpenAI · 2024-05-13 · Multimodal model
GPT-4o (2024-11-20): 2 wins (29%) · Ties: 3 (43%) · GPT-4o: 2 wins (29%)
Benchmark scores
Grouped by capability, sorted by largest gap within each group. 7 shared benchmarks.
Coding and Software Engineering
GPT-4o (2024-11-20) leads 1/2

| Benchmark | GPT-4o (2024-11-20) | GPT-4o | Diff |
|---|---|---|---|
| HumanEval | 90.20 (rank 7/39) | 90.00 (rank 8/39) | +0.20 |
| SWE-bench Verified | 31.9 (rank 8/103), Normal (No Tools) | 31.9 (rank 8/103) | — |
General Knowledge

GPT-4o leads 1/2

| Benchmark | GPT-4o (2024-11-20) | GPT-4o | Diff |
|---|---|---|---|
Specs
| Field | GPT-4o (2024-11-20) | GPT-4o |
|---|---|---|
| Publisher | OpenAI | OpenAI |
| Release date | 2024-11-20 | 2024-05-13 |
| Model type | AI model | Multimodal model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | 128K | 128K |
| Max output | Not available | 16,384 tokens |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | GPT-4o (2024-11-20) | GPT-4o |
|---|---|---|
| Text input | Not public | $2.50 / 1M tokens |
| Text output | Not public | $10.00 / 1M tokens |
One or both models have incomplete public pricing.
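As a minimal sketch of how the listed per-million-token rates translate into a per-request cost, assuming GPT-4o's $2.50 input and $10.00 output rates from the table above (the 2024-11-20 snapshot's rates are not public, so it is omitted):

```python
# GPT-4o's listed rates, in USD per 1M tokens (from the pricing table above).
INPUT_USD_PER_MTOK = 2.50
OUTPUT_USD_PER_MTOK = 10.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """USD cost of one request at per-million-token rates."""
    return (input_tokens * INPUT_USD_PER_MTOK
            + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

# e.g. a 10,000-token prompt with a 1,000-token completion
print(round(request_cost(10_000, 1_000), 4))  # → 0.035
```

Output costs dominate at these rates: each generated token costs four times as much as each prompt token.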
Summary
- GPT-4o (2024-11-20) leads in: Coding and Software Engineering (1/2), Common Sense (1/1)
- GPT-4o leads in: General Knowledge (1/2), Math and Reasoning (1/2)
On average across the 7 shared benchmarks, GPT-4o scores 1.37 higher.
Largest single-benchmark gap: MATH, where GPT-4o (2024-11-20) scores 68.50 vs GPT-4o's 75.90 (-7.40).
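The summary aggregates above (win counts, average score difference, largest gap) can be sketched from per-benchmark records like so. This is an illustrative reconstruction, not the page generator's actual code; the three sample records are benchmark scores that appear on this page, with the remaining shared benchmarks omitted:

```python
def summarize(records):
    """Aggregate (benchmark, score_a, score_b) records into summary stats."""
    diffs = {name: a - b for name, a, b in records}
    wins_a = sum(1 for d in diffs.values() if d > 0)   # model A leads
    wins_b = sum(1 for d in diffs.values() if d < 0)   # model B leads
    ties = sum(1 for d in diffs.values() if d == 0)
    avg_diff = sum(diffs.values()) / len(diffs)        # mean of A - B
    largest = max(diffs, key=lambda k: abs(diffs[k]))  # biggest single gap
    return wins_a, wins_b, ties, avg_diff, largest

records = [
    ("HumanEval", 90.20, 90.00),
    ("SWE-bench Verified", 31.9, 31.9),
    ("MATH", 68.50, 75.90),
]
print(summarize(records))
```

On the full set of 7 shared records, the same function would yield the 2-2-3 split and the -1.37 average reported above.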
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.