- Claude Sonnet 4.6 leads in: General Knowledge (2/3), AI Agent - Information Search (1/1), AI Agent - Tool Usage (1/1), Claw-style Agent Evaluation (1/1), Coding and Software Engineer (1/1), Productivity Knowledge (1/1)
- Gemini 3.0 Pro (Preview 11-2025) leads in: Agent Level Benchmark (1/1), Math and Reasoning (1/1)
- Tied in: Long Context
On average across the 11 shared benchmarks, Claude Sonnet 4.6 scores 5.66 points higher than Gemini 3.0 Pro (Preview 11-2025).
Largest single-benchmark gap: GDPval-AA, where Claude Sonnet 4.6 scores 57 and Gemini 3.0 Pro (Preview 11-2025) scores 35 (+22).
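
The two summary figures above (the average difference and the largest single-benchmark gap) could plausibly be derived from per-benchmark score pairs as sketched below. This is only an illustration: the `BenchmarkScore` type and the `mean_gap` and `largest_gap` helpers are hypothetical names, not the page's actual generation code, and the only real data point used is the GDPval-AA pair quoted above.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BenchmarkScore:
    """One shared benchmark with a score for each model (hypothetical type)."""
    name: str
    score_a: float  # e.g. Claude Sonnet 4.6
    score_b: float  # e.g. Gemini 3.0 Pro (Preview 11-2025)

    @property
    def gap(self) -> float:
        # Positive when model A scores higher, negative when model B does.
        return self.score_a - self.score_b


def mean_gap(rows: Sequence[BenchmarkScore]) -> float:
    """Average signed difference (A minus B) across the shared benchmarks."""
    if not rows:
        raise ValueError("at least one shared benchmark is required")
    return sum(r.gap for r in rows) / len(rows)


def largest_gap(rows: Sequence[BenchmarkScore]) -> BenchmarkScore:
    """Benchmark with the largest absolute difference between the two models."""
    if not rows:
        raise ValueError("at least one shared benchmark is required")
    return max(rows, key=lambda r: abs(r.gap))


# The one pair of scores stated on this page.
gdpval = BenchmarkScore("GDPval-AA", 57, 35)
print(gdpval.gap)  # 22
```

Passing all 11 shared benchmark records to `mean_gap` would give the average difference, assuming it is a plain unweighted mean of signed differences; the page does not state whether any weighting or normalization is applied.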
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.