GPT-5.5 Pro vs GPT-5.2 Pro
Across 5 shared benchmarks, GPT-5.5 Pro leads overall: GPT-5.5 Pro wins 5, GPT-5.2 Pro wins 0, with 0 ties and an average score difference of +12.82.
GPT-5.5 Pro
OpenAI · 2026-04-23 · Reasoning model
GPT-5.2 Pro
OpenAI · 2025-12-11 · Reasoning model
GPT-5.5 Pro: 5 wins (100%) · GPT-5.2 Pro: 0 wins (0%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 5 shared benchmarks.
General Knowledge
GPT-5.5 Pro wins 3/3

| Benchmark | GPT-5.5 Pro | GPT-5.2 Pro | Diff |
|---|---|---|---|
| ARC-AGI-2 | 84.60 · 3 / 59 · Thinking High (No Tools) | 54.20 · 20 / 59 | +30.40 |
| HLE | 57.20 · 6 / 157 · Extremely High Intensity Thinking (With Tools) | 50.00 · 22 / 157 | +7.20 |
| ARC-AGI | 96.50 · 1 / 65 · Thinking High (No Tools) | 90.50 · 15 / 65 | +6.00 |
AI Agent - Information Search
GPT-5.5 Pro wins 1/1

| Benchmark | GPT-5.5 Pro | GPT-5.2 Pro | Diff |
|---|---|---|---|
| BrowseComp | 90.10 · 1 / 45 · Deep Thinking (With Tools + Internet) | 77.90 · 16 / 45 | +12.20 |
Math and Reasoning
GPT-5.5 Pro wins 1/1

| Benchmark | GPT-5.5 Pro | GPT-5.2 Pro | Diff |
|---|---|---|---|
| FrontierMath - Tier 4 | 39.60 · 1 / 80 · Thinking High (No Tools) | 31.30 · 9 / 80 | +8.30 |
Specs
| Field | GPT-5.5 Pro | GPT-5.2 Pro |
|---|---|---|
| Publisher | OpenAI | OpenAI |
| Release date | 2026-04-23 | 2025-12-11 |
| Model type | Reasoning model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | 1000K | 256K |
| Max output | 128K | Not available |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | GPT-5.5 Pro | GPT-5.2 Pro |
|---|---|---|
| Text input | $30 / 1M tokens | Not public |
| Text output | $180 / 1M tokens | Not public |
One or both models have incomplete public pricing.
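To make the per-token prices concrete, the following is a minimal sketch of a per-request cost estimate. The function name `estimate_cost`, the `PRICES` dictionary, and the example token counts are hypothetical and chosen only for illustration. The prices themselves are the ones listed in the table, and the GPT-5.2 Pro fields stay empty because they are not public.

```python
from typing import Optional

# Per-1M-token prices in USD, copied from the pricing table.
# None marks a field listed as "Not public"; it is deliberately not guessed.
PRICES = {
    "GPT-5.5 Pro": {"input": 30.0, "output": 180.0},
    "GPT-5.2 Pro": {"input": None, "output": None},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated USD cost of one request, or None if pricing is missing."""
    price = PRICES[model]
    if price["input"] is None or price["output"] is None:
        return None
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical request: 200K input tokens and 20K output tokens.
example = estimate_cost("GPT-5.5 Pro", input_tokens=200_000, output_tokens=20_000)
print(f"${example:.2f}")  # prints "$9.60"
```

Returning `None` for GPT-5.2 Pro mirrors the page's rule that missing pricing fields are not inferred.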
Summary
- GPT-5.5 Pro leads in: General Knowledge (3/3), AI Agent - Information Search (1/1), Math and Reasoning (1/1)
On average across the 5 shared benchmarks, GPT-5.5 Pro scores 12.82 higher.
Largest single-benchmark gap: ARC-AGI-2 — GPT-5.5 Pro 84.60 vs GPT-5.2 Pro 54.20 (+30.40).
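The summary figures can be reproduced from the five score pairs in the benchmark tables. The sketch below is one way to do that, not the page's own generator. The variable names are illustrative, and the GPT-5.2 Pro HLE score is taken as 50.00, which is the reading consistent with the listed +7.20 difference.

```python
# (GPT-5.5 Pro, GPT-5.2 Pro) scores from the benchmark tables.
scores = {
    "ARC-AGI-2": (84.60, 54.20),
    "HLE": (57.20, 50.00),
    "ARC-AGI": (96.50, 90.50),
    "BrowseComp": (90.10, 77.90),
    "FrontierMath - Tier 4": (39.60, 31.30),
}

# Per-benchmark difference, rounded to two decimals to avoid float noise.
diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}

wins_55 = sum(d > 0 for d in diffs.values())   # benchmarks where GPT-5.5 Pro is ahead
wins_52 = sum(d < 0 for d in diffs.values())   # benchmarks where GPT-5.2 Pro is ahead
ties = sum(d == 0 for d in diffs.values())

avg_diff = round(sum(diffs.values()) / len(diffs), 2)
largest = max(diffs, key=lambda name: abs(diffs[name]))

print(wins_55, wins_52, ties)              # 5 0 0
print(f"{avg_diff:+.2f}")                  # +12.82
print(largest, f"{diffs[largest]:+.2f}")   # ARC-AGI-2 +30.40
```

The average is an unweighted mean over benchmarks, so the ARC-AGI-2 gap alone accounts for nearly half of the total difference.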
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.