See key specs and per-benchmark scores for each model/mode. Scroll horizontally for all columns. This page compares evaluation data and core specs for 2 models.

Gemini 3.5 Flash
Google DeepMind
Best overall: Gemini 3.5 Flash · 63.57
Best single: Gemini 3.5 Flash · OSWorld-Verified 78.40
Modality coverage: Gemini 3.5 Flash · 1 modality
Head to head (Gemini 3.5 Flash vs. Claude Sonnet 4.6)
Benchmarks: 3
Wins: 2
Losses: 1
Average diff: +3.63
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
3 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | Category | Gemini 3.5 Flash | Claude Sonnet 4.6 |
|---|---|---|---|
| ARC-AGI-2 | General evaluation | 72.10 (Thinking Level · High, Tools) | 58.30 (Thinking Enabled) |
| HLE | General evaluation | 40.20 (Thinking Level · High, Tools) | 49.00 (Thinking Enabled, Tools) |
| OSWorld-Verified | AI Agent - Tool use | 78.40 (Thinking Level · High, Tools) | 72.50 (Thinking Enabled, Tools) |
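
The head-to-head figures reported for these two models (wins, losses, average diff) can be recomputed from the three comparable scores. The sketch below is illustrative only: the function name `head_to_head` and the dictionary layout are hypothetical, and it assumes that "Best overall" is an unweighted mean of the listed scores and that "Average diff" is the difference between the two means. The page does not state its formula, but under that assumption the script reproduces the reported 63.57 and +3.63.

```python
from statistics import mean

# Scores copied from the comparison table: (Gemini 3.5 Flash, Claude Sonnet 4.6).
scores = {
    "ARC-AGI-2": (72.10, 58.30),
    "HLE": (40.20, 49.00),
    "OSWorld-Verified": (78.40, 72.50),
}


def head_to_head(scores):
    """Summarize the first model against the second (hypothetical helper).

    Assumes unweighted means; the page's actual formula is not documented.
    """
    first = [a for a, _ in scores.values()]
    second = [b for _, b in scores.values()]
    best_name = max(scores, key=lambda name: scores[name][0])
    return {
        "benchmarks": len(scores),
        "wins": sum(a > b for a, b in scores.values()),
        "losses": sum(a < b for a, b in scores.values()),
        "first_avg": round(mean(first), 2),
        "second_avg": round(mean(second), 2),
        "avg_diff": round(mean(first) - mean(second), 2),
        "first_best_single": (best_name, scores[best_name][0]),
    }


summary = head_to_head(scores)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Because only three benchmarks are comparable, a single result (here ARC-AGI-2, with a gap of 13.8 points) dominates the average diff, so the win/loss count and the average are best read together.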
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Group | Features & specs | Gemini 3.5 Flash (Google DeepMind) | Claude Sonnet 4.6 (Anthropic) |
|---|---|---|---|
| Core specs | Release | 2026-06-20 | 2026-02-17 |
| Core specs | Context length | 1M | 1M |
| Core specs | Max output | 65536 | 8192 |
| Core specs | MoE | No | No |
| Core specs | Supported modes | No mode data | Non-Thinking Mode, Thinking Mode |
| License | Code Open Source | Not provided | Not provided |
| License | Weights Open Source | Not provided | Not provided |
| License | Commercial use | Not open source | Not open source |
| Modality support | Text Input/Output | / | / |
| Resources | Paper / report | Gemini 3.5: frontier intelligence with action | Introducing Claude Sonnet 4.6 |

Claude Sonnet 4.6
Anthropic