Gemini 3.5 Flash vs GPT-5.5
Across 4 shared benchmarks, GPT-5.5 leads overall: Gemini 3.5 Flash wins 0, GPT-5.5 wins 4, with 0 ties and an average score difference of -7.18 (Gemini 3.5 Flash minus GPT-5.5).
Gemini 3.5 Flash
Google DeepMind · 2026-06-20 · Multimodal model
GPT-5.5
OpenAI · 2026-04-23 · Reasoning model
Gemini 3.5 Flash 0 wins (0%) · GPT-5.5 4 wins (100%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.
General Knowledge
GPT-5.5 2/2
| Benchmark | Gemini 3.5 Flash | GPT-5.5 | Diff |
|---|---|---|---|
| ARC-AGI-2 | 72.10 (11 / 59), Thinking High (With Tools) | 85 (1 / 59), Extra-High Thinking (No Tools) | -12.90 |
| HLE | 40.20 (45 / 150), Thinking High (With Tools) | 52.20 (10 / 150), Thinking High (With Tools) | -12 |
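The Diff column and the win counts follow from simple arithmetic on the per-benchmark scores. The sketch below is illustrative only: it assumes Diff is the Gemini 3.5 Flash score minus the GPT-5.5 score (which matches the two rows above), and the names `ROWS` and `compare` are hypothetical. It covers only the two benchmarks listed in this table, so its mean will not match the page-wide average of -7.18, which spans all 4 shared benchmarks.

```python
# (benchmark, Gemini 3.5 Flash score, GPT-5.5 score) for the two rows listed above.
ROWS = [
    ("ARC-AGI-2", 72.10, 85.0),
    ("HLE", 40.20, 52.20),
]


def compare(rows):
    """Return per-benchmark diffs, win counts for each side, ties, and the mean diff.

    Assumption: diff = first model's score minus second model's score,
    so a negative diff means the second model scored higher.
    """
    diffs = {name: round(a - b, 2) for name, a, b in rows}
    wins_first = sum(d > 0 for d in diffs.values())
    wins_second = sum(d < 0 for d in diffs.values())
    ties = sum(d == 0 for d in diffs.values())
    mean_diff = round(sum(diffs.values()) / len(diffs), 2)
    return diffs, wins_first, wins_second, ties, mean_diff


if __name__ == "__main__":
    diffs, wins_first, wins_second, ties, mean_diff = compare(ROWS)
    for name, d in diffs.items():
        print(f"{name}: {d:+.2f}")
    print(f"Gemini 3.5 Flash wins: {wins_first}, GPT-5.5 wins: {wins_second}, ties: {ties}")
    print(f"Mean diff over these {len(diffs)} rows: {mean_diff:+.2f}")
```

Rounding each diff to two decimals before comparing against zero avoids floating-point noise being counted as a win for either side.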
AI Agent - Tool Usage
GPT-5.5 1/1
Specs
| Field | Gemini 3.5 Flash | GPT-5.5 |
|---|---|---|
| Publisher | Google DeepMind | OpenAI |
| Release date | 2026-06-20 | 2026-04-23 |
| Model type | Multimodal model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | Not public | Not public |
| Context length | 1M tokens | 1M tokens |
| Max output | 65,536 tokens | 131,072 tokens |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | Gemini 3.5 Flash | GPT-5.5 |
|---|---|---|
| Text input | $1.5 / 1M tokens | $0.5 / 1M tokens |
| Text output | $9 / 1M tokens | $30 / 1M tokens |
| Cache read | Not public | $0.5 / 1M tokens |
| Cache write | Not public | $6.25 / 1M tokens |
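Because the two models price input and output very differently, the cheaper option depends on the input-to-output ratio of a workload. The following is a minimal sketch of that arithmetic using the text input and output prices from the table above; the `PRICES` dictionary, the `request_cost` helper, and the 200,000 / 20,000 token workload are hypothetical, and cache pricing is ignored because it is not public for Gemini 3.5 Flash.

```python
# USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.5, "output": 9.0},
    "GPT-5.5": {"input": 0.5, "output": 30.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from uncached input and output tokens."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200,000 input tokens and 20,000 output tokens.
    for model in PRICES:
        cost = request_cost(model, 200_000, 20_000)
        print(f"{model}: ${cost:.2f}")
```

For this assumed workload the estimate comes to about $0.48 for Gemini 3.5 Flash and $0.70 for GPT-5.5; a workload dominated by input tokens would shift the comparison toward GPT-5.5.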
Summary
- GPT-5.5 leads in: General Knowledge (2/2), AI Agent - Tool Usage (1/1), Coding and Software Engineer (1/1)
On average across the 4 shared benchmarks, GPT-5.5 scores 7.18 higher.
Largest single-benchmark gap: ARC-AGI-2 — Gemini 3.5 Flash 72.10 vs GPT-5.5 85 (-12.90).
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.