Gemini 3.5 Flash vs Claude Sonnet 4.6
Across 6 shared benchmarks, Gemini 3.5 Flash leads overall with 4 wins to Claude Sonnet 4.6's 2, no ties, and an average score difference of +5.26 in Gemini 3.5 Flash's favor.
Gemini 3.5 Flash
Google DeepMind · 2026-06-20 · Multimodal model
Claude Sonnet 4.6
Anthropic · 2026-02-17 · Chat model
Gemini 3.5 Flash: 4 wins (67%) · Claude Sonnet 4.6: 2 wins (33%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 6 shared benchmarks.
General Knowledge
Claude Sonnet 4.6 leads 2/3

| Benchmark | Gemini 3.5 Flash | Claude Sonnet 4.6 | Diff |
|---|---|---|---|
| ARC-AGI-2 | 72.10 (rank 11 / 59; Thinking High, With Tools) | 58.30 (rank 18 / 59) | +13.80 |
| HLE | 40.20 (rank 55 / 161; Thinking High, With Tools) | 49 (rank 27 / 161) | -8.80 |
| LiveBench | 75.02 (rank 17 / 115; Thinking High, No Tools) | 75.47 (rank 12 / 115; Thinking Medium, No Tools) | -0.45 |

AI Agent - Tool Usage
Gemini 3.5 Flash leads 2/2

| Benchmark | Gemini 3.5 Flash | Claude Sonnet 4.6 | Diff |
|---|---|---|---|
| MCP-Atlas | 83.60 (rank 1 / 23; Thinking High, With Tools) | 69.50 (rank 13 / 23; Normal, With Tools) | +14.10 |
| OSWorld-Verified | 78.40 (rank 6 / 19; Thinking High, With Tools) | 72.50 (rank 11 / 19) | +5.90 |

Coding and Software Engineer
Gemini 3.5 Flash leads 1/1

| Benchmark | Gemini 3.5 Flash | Claude Sonnet 4.6 | Diff |
|---|---|---|---|
| DeepSWE | 37 (rank 6 / 9; Thinking Medium, With Tools) | 30 (rank 8 / 9; Thinking High, With Tools) | +7 |

Specs
| Field | Gemini 3.5 Flash | Claude Sonnet 4.6 |
|---|---|---|
| Publisher | Google DeepMind | Anthropic |
| Release date | 2026-06-20 | 2026-02-17 |
| Model type | Multimodal model | Chat model |
| Architecture | Dense | Dense |
| Parameters | Not available | Not available |
| Context length | 1M | 1M |
| Max output | 64K | 8K |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | Gemini 3.5 Flash | Claude Sonnet 4.6 |
|---|---|---|
| Text input | $1.5 / 1M tokens | $3 / 1M tokens |
| Text output | $9 / 1M tokens | $15 / 1M tokens |
| Cache read | Not public | $0.3 / 1M tokens |
| Cache write | Not public | $3.75 / 1M tokens |
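
To make the list prices above easier to compare, here is a minimal sketch that estimates the uncached text cost of a single workload on each model. The workload size (2M input tokens, 500K output tokens) and the `estimate_cost` helper are hypothetical and chosen only for illustration. Cache pricing is left out because the Gemini 3.5 Flash cache fields are not public.

```python
# List prices in USD per 1M tokens, copied from the pricing table above.
PRICES = {
    "Gemini 3.5 Flash": {"input": 1.5, "output": 9.0},
    "Claude Sonnet 4.6": {"input": 3.0, "output": 15.0},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the uncached text cost in USD for one workload (hypothetical helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload, not taken from the page.
workload = {"input_tokens": 2_000_000, "output_tokens": 500_000}

costs = {model: estimate_cost(model, **workload) for model in PRICES}
for model, usd in costs.items():
    print(f"{model}: ${usd:.2f}")
```

Under these assumptions the gap comes from both the input and the output rate. Real bills would also depend on caching, batching, and any tiered pricing, none of which is modeled here.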
Summary
- Gemini 3.5 Flash leads in: AI Agent - Tool Usage (2/2), Coding and Software Engineer (1/1)
- Claude Sonnet 4.6 leads in: General Knowledge (2/3)
On average across the 6 shared benchmarks, Gemini 3.5 Flash scores 5.26 points higher.
Largest single-benchmark gap: MCP-Atlas, where Gemini 3.5 Flash scores 83.60 vs 69.50 for Claude Sonnet 4.6 (+14.10).
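
The win counts, average difference, and largest gap can be re-derived from the six benchmark scores. The sketch below assumes the difference is always Gemini 3.5 Flash minus Claude Sonnet 4.6 and that the average is a plain unweighted mean over benchmarks. The variable names are illustrative and not part of the page's data format.

```python
# (Gemini 3.5 Flash, Claude Sonnet 4.6) scores copied from the benchmark tables.
scores = {
    "ARC-AGI-2": (72.10, 58.30),
    "HLE": (40.20, 49.00),
    "LiveBench": (75.02, 75.47),
    "MCP-Atlas": (83.60, 69.50),
    "OSWorld-Verified": (78.40, 72.50),
    "DeepSWE": (37.00, 30.00),
}

# Assumed sign convention: positive means Gemini 3.5 Flash scored higher.
diffs = {name: round(g - c, 2) for name, (g, c) in scores.items()}

gemini_wins = sum(d > 0 for d in diffs.values())
claude_wins = sum(d < 0 for d in diffs.values())
ties = sum(d == 0 for d in diffs.values())

# Unweighted mean of the per-benchmark differences.
mean_diff = round(sum(diffs.values()) / len(diffs), 2)

# Benchmark with the largest absolute gap, regardless of who leads.
largest_gap = max(diffs, key=lambda name: abs(diffs[name]))

print(gemini_wins, claude_wins, ties, mean_diff, largest_gap)
```

Because the benchmarks use different scales and evaluation modes, an unweighted mean of raw differences is only a rough summary. It should not be read as a normalized measure of overall capability.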
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.