GPT-4o(2024-11-20) vs GPT-4o

GPT-4o(2024-11-20) and GPT-4o are tied across 7 shared benchmarks: GPT-4o(2024-11-20) leads on 2, GPT-4o leads on 2, and 3 are ties. The average score difference (GPT-4o(2024-11-20) minus GPT-4o) is -1.37.

GPT-4o(2024-11-20)

OpenAI · 2024-11-20 · AI model

GPT-4o

OpenAI · 2024-05-13 · Multimodal model

GPT-4o(2024-11-20): 2 wins (29%) · Ties: 3 · GPT-4o: 2 wins (29%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 7 shared benchmarks.
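A minimal Python sketch of the grouping and ordering described above, using only the rows visible on this page. The tuple layout, the function name `group_and_sort`, and the placement of MATH and FrontierMath under "Math and Reasoning" are assumptions for illustration; the page's actual generator is not shown.

```python
from collections import defaultdict

# (capability, benchmark, GPT-4o(2024-11-20) score, GPT-4o score)
# Scores are as read from the tables on this page; only visible rows are listed.
ROWS = [
    ("Coding and Software Engineer", "SWE-bench Verified", 31.0, 31.0),
    ("Coding and Software Engineer", "HumanEval", 90.20, 90.0),
    ("Math and Reasoning", "FrontierMath", 0.30, 0.30),  # grouping assumed
    ("Math and Reasoning", "MATH", 68.50, 75.90),        # grouping assumed
]


def group_and_sort(rows):
    """Group rows by capability, then order each group by absolute gap, largest first."""
    groups = defaultdict(list)
    for capability, benchmark, first, second in rows:
        # Diff follows the page's convention: first model minus second model.
        groups[capability].append((benchmark, round(first - second, 2)))
    return {
        capability: sorted(items, key=lambda item: abs(item[1]), reverse=True)
        for capability, items in groups.items()
    }


grouped = group_and_sort(ROWS)
for capability, items in grouped.items():
    print(capability, items)
```

Sorting on the absolute value of the difference keeps a large negative gap (such as MATH) ahead of a tie within the same group.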

Coding and Software Engineer

GPT-4o(2024-11-20) 1/2

Benchmark	GPT-4o(2024-11-20)	GPT-4o	Diff
HumanEval	90.20 (7 / 39)	90 (8 / 39)	+0.20
SWE-bench Verified	31 (98 / 103), Normal (No Tools)	31 (98 / 103)	—

General Knowledge

GPT-4o 1/2

Benchmark	GPT-4o(2024-11-20)	GPT-4o	Diff

Specs

Field	GPT-4o(2024-11-20)	GPT-4o
Publisher	OpenAI	OpenAI
Release date	2024-11-20	2024-05-13
Model type	AI model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	128K	128K
Max output	Not available	16384

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item	GPT-4o(2024-11-20)	GPT-4o
Text input	Not public	2.5 USD / 1M tokens
Text output	Not public	10 USD / 1M tokens

One or both models have incomplete public pricing.

Summary

GPT-4o(2024-11-20) leads in: Coding and Software Engineer (1/2), Common Sense (1/1)
GPT-4o leads in: General Knowledge (1/2), Math and Reasoning (1/2)

On average across the 7 shared benchmarks, GPT-4o scores 1.37 higher.

Largest single-benchmark gap: MATH — GPT-4o(2024-11-20) 68.50 vs GPT-4o 75.90 (-7.40).
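The summary figures above follow from simple per-benchmark arithmetic. The sketch below is a hedged illustration of that arithmetic in Python, not the page's actual code: the names `VISIBLE`, `summarize`, and `share` are hypothetical, and the data covers only the four benchmark rows shown on this page. Because three of the seven shared benchmarks are not listed, the average it computes over the visible rows (-1.80) is not expected to match the page-wide -1.37.

```python
# (benchmark, GPT-4o(2024-11-20) score, GPT-4o score), visible rows only.
VISIBLE = [
    ("HumanEval", 90.20, 90.0),
    ("SWE-bench Verified", 31.0, 31.0),
    ("MATH", 68.50, 75.90),
    ("FrontierMath", 0.30, 0.30),
]


def summarize(rows):
    """Tally wins and ties, average the differences, and find the largest gap."""
    # Diff is first model minus second model, rounded to two decimals.
    diffs = {name: round(first - second, 2) for name, first, second in rows}
    return {
        "wins_first": sum(d > 0 for d in diffs.values()),
        "wins_second": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "average_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


def share(count, total):
    """Whole-number percentage, e.g. 2 wins out of 7 benchmarks."""
    return round(100 * count / total)


summary = summarize(VISIBLE)
print(summary)
print(share(2, 7))  # 2 of 7 shared benchmarks
```

With the full set of seven benchmark rows in place of `VISIBLE`, the same function would produce the page-level tallies; `share(2, 7)` shows where the 29% win share comes from.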

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.


Benchmark	GPT-4o(2024-11-20)	GPT-4o	Diff
MATH	68.50 (24 / 42)	75.90 (16 / 42)	-7.40
FrontierMath	0.30 (57 / 60)	0.30 (57 / 60)	—