See key specs and per-benchmark scores for each model/mode. This page currently compares evaluation data and core specifications for 2 models.

GPT-5.2 Pro
OpenAI

Best overall: GPT-5.2 Pro · 63.84
Best single: GPT-5.2 Pro · GPQA Diamond 93.20
Modality coverage: GPT-5.2 Pro · 2 modalities
Head to head: 5 benchmarks · 5 wins · 0 losses · +13.44 average diff
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
5 benchmarks with comparable scores. Each model shows its best score, with the mode label in parentheses.
| Benchmark | GPT-5.2 Pro | Opus 4.5 |
|---|---|---|
| ARC-AGI (comprehensive evaluation) | 90.50 (Thinking Enabled) | 80.00 (Extended Thinking) |
| ARC-AGI-2 (comprehensive evaluation) | 54.20 (Thinking Enabled) | 37.60 (Extended Thinking) |
| GPQA Diamond (comprehensive evaluation) | 93.20 (Thinking Enabled) | 87.00 (Extended Thinking) |
| HLE (comprehensive evaluation) | 50.00 (Thinking Enabled, Tools) | 43.20 (Extended Thinking, Tools) |
|  | 31.30 (Thinking Enabled) | 4.20 (Standard Mode) |
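
The summary figures can be cross-checked against the benchmark scores. The sketch below assumes that "Best overall" is the unweighted mean of a model's five listed scores and that "Average diff" is the mean of the per-benchmark differences (GPT-5.2 Pro minus Opus 4.5). The page does not state its formulas, so both are assumptions. The function name `head_to_head` and the key `"unnamed"` (standing in for the row whose benchmark name is missing) are illustrative only.

```python
# Scores copied from the benchmark table: (GPT-5.2 Pro, Opus 4.5).
# "unnamed" is a placeholder key for the row with no benchmark name.
scores = {
    "ARC-AGI": (90.50, 80.00),
    "ARC-AGI-2": (54.20, 37.60),
    "GPQA Diamond": (93.20, 87.00),
    "HLE": (50.00, 43.20),
    "unnamed": (31.30, 4.20),
}


def head_to_head(scores):
    """Summarize the first model against the second across all benchmarks."""
    diffs = [first - second for first, second in scores.values()]
    first_scores = [first for first, _ in scores.values()]
    return {
        "benchmarks": len(diffs),
        "wins": sum(d > 0 for d in diffs),    # first model scored higher
        "losses": sum(d < 0 for d in diffs),  # first model scored lower
        "average_diff": round(sum(diffs) / len(diffs), 2),
        "mean_first": round(sum(first_scores) / len(first_scores), 2),
    }


summary = head_to_head(scores)
print(summary)
```

Under these assumptions the result matches the listed figures: 5 benchmarks, 5 wins, 0 losses, an average diff of 13.44, and a mean score of 63.84 for GPT-5.2 Pro.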
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Features & specs | GPT-5.2 Pro (OpenAI) | Opus 4.5 (Anthropic) |
|---|---|---|
| Core specs |  |  |
| Release | 2025-12-11 | 2025-11-25 |
| Context length | 256K | 200K |
| Max output | Not provided | 65536 |
| MoE | No | No |
| Supported modes | Thinking Mode | No mode data |
| License |  |  |
| Code Open Source | Not provided | Not provided |
| Weights Open Source | Not provided | Not provided |
| Commercial use | Not open source | Not open source |
| Modality support |  |  |
| Text Input/Output | / | / |
| Image Input/Output | / | / |
| Resources |  |  |
| Paper / report | Introducing GPT-5.2 | Introducing Claude Opus 4.5 |
Opus 4.5
Anthropic