Gemini 2.0 Pro Experimental vs GPT-4o (2024-11-20)

Across 4 shared benchmarks, Gemini 2.0 Pro Experimental leads overall: it wins 4 and GPT-4o (2024-11-20) wins 0, with 0 ties and an average score difference of +7.70 in Gemini's favor.

DeepMind
Gemini 2.0 Pro Experimental

DeepMind · 2025-02-05 · Chat model

OpenAI
GPT-4o (2024-11-20)

OpenAI · 2024-11-20 · Chat model

Gemini 2.0 Pro Experimental: 4 wins (100%) · GPT-4o (2024-11-20): 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge

Gemini 2.0 Pro Experimental leads 2/2

Benchmark | Gemini 2.0 Pro Experimental | GPT-4o (2024-11-20) | Diff
MMLU Pro | 79.10 (62 / 126) | 77.90 (72 / 126) | +1.20
MMLU | 86.50 (28 / 65) | 85.70 (37 / 65) | +0.80

Common Sense

Gemini 2.0 Pro Experimental leads 1/1

Benchmark | Gemini 2.0 Pro Experimental | GPT-4o (2024-11-20) | Diff
SimpleQA | 44.30 (15 / 45) | 38.80 (19 / 45) | +5.50

Math and Reasoning

Gemini 2.0 Pro Experimental leads 1/1

Benchmark | Gemini 2.0 Pro Experimental | GPT-4o (2024-11-20) | Diff
MATH | 91.80 (4 / 42) | 68.50 (24 / 42) | +23.30

Specs

Field | Gemini 2.0 Pro Experimental | GPT-4o (2024-11-20)
Publisher | DeepMind | OpenAI
Release date | 2025-02-05 | 2024-11-20
Model type | Chat model | Chat model
Architecture | Dense | Dense
Parameters | Not available | Not available
Context length | 2000K | 128K
Max output | 8K | Not available

Summary

  • Gemini 2.0 Pro Experimental leads in: General Knowledge (2/2), Common Sense (1/1), Math and Reasoning (1/1)

On average across the 4 shared benchmarks, Gemini 2.0 Pro Experimental scores 7.70 higher.

Largest single-benchmark gap: MATH, where Gemini 2.0 Pro Experimental scores 91.80 vs 68.50 for GPT-4o (2024-11-20) (+23.30).
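
The summary figures above can be recomputed from the four benchmark rows. The following is a minimal sketch of that arithmetic, not the page's actual generation pipeline: the names `scores` and `summarize` are hypothetical, and the only data used are the scores listed in the benchmark tables on this page.

```python
# Scores copied from the benchmark tables on this page, as
# (benchmark, Gemini 2.0 Pro Experimental, GPT-4o (2024-11-20)).
scores = [
    ("MMLU Pro", 79.10, 77.90),
    ("MMLU", 86.50, 85.70),
    ("SimpleQA", 44.30, 38.80),
    ("MATH", 91.80, 68.50),
]


def summarize(rows):
    """Hypothetical helper: win counts, mean difference and largest gap."""
    # Per-benchmark difference (first model minus second), rounded to
    # two decimals to match the precision shown in the tables.
    diffs = {name: round(a - b, 2) for name, a, b in rows}
    return {
        "diffs": diffs,
        "wins_a": sum(d > 0 for d in diffs.values()),
        "wins_b": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        # Benchmark with the largest absolute gap between the two models.
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


summary = summarize(scores)
print(summary)
```

With these inputs the per-benchmark differences are 1.20, 0.80, 5.50 and 23.30, which sum to 30.80 and average to 7.70 over 4 benchmarks, matching the figures reported in the summary.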

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.