GPT-4o(2024-11-20) vs Claude3-Opus

Across 4 shared benchmarks, GPT-4o(2024-11-20) leads overall with 3 wins to 1 for Claude3-Opus, 0 ties, and an average score difference of +5.51 in its favor.

GPT-4o(2024-11-20)

OpenAI · 2024-11-20 · AI model

Claude3-Opus

Anthropic · 2024-03-04 · Multimodal model

GPT-4o(2024-11-20): 3 wins (75%) · Claude3-Opus: 1 win (25%)
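The win shares above follow from the win counts. The sketch below reproduces them, assuming ties count toward the denominator (the page does not state this, and with 0 ties it makes no difference here). The function name `win_share` is hypothetical and is not taken from the site.

```python
def win_share(wins_a: int, wins_b: int, ties: int = 0) -> tuple[float, float]:
    """Return each model's share of shared benchmarks won, in percent.

    Assumption: ties are included in the total number of shared benchmarks.
    """
    total = wins_a + wins_b + ties
    if total == 0:
        raise ValueError("no shared benchmarks to compare")
    return 100 * wins_a / total, 100 * wins_b / total


# Win counts as reported on this page: 3 for GPT-4o(2024-11-20), 1 for Claude3-Opus, 0 ties.
gpt4o_pct, opus_pct = win_share(3, 1, 0)
print(f"GPT-4o(2024-11-20): {gpt4o_pct:.0f}% · Claude3-Opus: {opus_pct:.0f}%")
```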

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge

Even (1 win each across 2 benchmarks)

Benchmark	GPT-4o(2024-11-20)	Claude3-Opus	Diff
MMLU Pro	77.90 (70 / 124)	68.45 (93 / 124)	+9.45
MMLU	85.70 (37 / 65)	86.80 (27 / 65)	-1.10
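The Diff column appears to be the GPT-4o(2024-11-20) score minus the Claude3-Opus score, so a positive value favors GPT-4o(2024-11-20). A minimal sketch of that arithmetic follows, using the MMLU Pro scores quoted in the Summary (77.90 and 68.45) and the MMLU scores read as 85.70 and 86.80. The helper names `diff` and `winner` are hypothetical, and the sign convention is inferred from the listed values rather than stated by the page.

```python
# Scores as (GPT-4o(2024-11-20), Claude3-Opus) for the General Knowledge group.
scores = {
    "MMLU Pro": (77.90, 68.45),
    "MMLU": (85.70, 86.80),
}


def diff(gpt4o: float, opus: float) -> float:
    """Signed gap; positive when GPT-4o(2024-11-20) scores higher (assumed convention)."""
    return gpt4o - opus


def winner(gpt4o: float, opus: float) -> str:
    """Name the higher-scoring model, or 'tie' when the scores are equal."""
    d = diff(gpt4o, opus)
    if d > 0:
        return "GPT-4o(2024-11-20)"
    if d < 0:
        return "Claude3-Opus"
    return "tie"


for name, (a, b) in scores.items():
    # Format to two decimals with an explicit sign, matching the Diff column.
    print(f"{name}\t{diff(a, b):+.2f}\t{winner(a, b)}")
```

Each model takes one of the two benchmarks, which is why this group is reported as even.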

Coding and Software Engineer

GPT-4o(2024-11-20) 1/1

Benchmark	GPT-4o(2024-11-20)	Claude3-Opus	Diff
HumanEval

Specs

Field	GPT-4o(2024-11-20)	Claude3-Opus
Publisher	OpenAI	Anthropic
Release date	2024-11-20	2024-03-04
Model type	AI model	Multimodal model
Architecture	Dense	Dense
Parameters	Not available	Not available
Context length	128K	200K
Max output	Not available	Not available

Summary

GPT-4o(2024-11-20) leads in: Coding and Software Engineer (1/1), Math and Reasoning (1/1)
Tied in: General Knowledge

On average across the 4 shared benchmarks, GPT-4o(2024-11-20) scores 5.51 higher.

Largest single-benchmark gap: MMLU Pro, where GPT-4o(2024-11-20) scores 77.90 vs 68.45 for Claude3-Opus (+9.45).

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
