Gemini 3.5 Flash vs Opus 4.7

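Gemini 3.5 Flash and Opus 4.7 are close overall across their 8 shared benchmarks: Gemini 3.5 Flash leads on 4, Opus 4.7 leads on 4, and none are tied. The mean score difference (Gemini 3.5 Flash minus Opus 4.7) is -0.36.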

Gemini 3.5 Flash: Google DeepMind · 2026-06-20 · multimodal large model

Opus 4.7: Anthropic · 2026-04-16 · reasoning large model

Benchmark wins: Gemini 3.5 Flash, 4 (50%); Opus 4.7, 4 (50%)

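Benchmark scores

Benchmarks are grouped by capability category and, within each group, sorted by the size of the score difference; 8 benchmarks in total. Differences are Gemini 3.5 Flash minus Opus 4.7.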

AI Agent - Tool Usage

Gemini 3.5 Flash leads 3/3
Benchmark | Gemini 3.5 Flash (score; rank; mode) | Opus 4.7 (score; rank; mode) | Difference
TerminalBench 2.1 | 76.20; 8 / 16; Thinking High (With Tools) | 69.70; 11 / 16; Thinking High (With Tools) | +6.50
MCP-Atlas | 83.60; 1 / 23; Thinking High (With Tools) | 79.10; 5 / 23; Deep Thinking (With Tools) | +4.50
OSWorld-Verified | 78.40; 6 / 19; Thinking High (With Tools) | 78; 7 / 19; Extended (with tools) | +0.40

General Knowledge

Opus 4.7 leads 3/3
Benchmark | Gemini 3.5 Flash (score; rank; mode) | Opus 4.7 (score; rank; mode) | Difference
HLE | 40.20; 55 / 161; Thinking High (With Tools) | 54.70; 9 / 161; Extended (with tools) | -14.50
ARC-AGI-2 | 72.10; 11 / 59; Thinking High (With Tools) | 75.80; 9 / 59; Max (no tools) | -3.70
LiveBench | 75.02; 17 / 115; Thinking High (No Tools) | 76.91; 7 / 115; Deep Thinking (No Tools) | -1.89

Coding and Software Engineer

Opus 4.7 leads 1/1
Benchmark | Gemini 3.5 Flash (score; rank; mode) | Opus 4.7 (score; rank; mode) | Difference
SWE-Bench Pro - Public | 55.10; 21 / 44; Thinking High (With Tools) | 64.30; 4 / 44; Extended (with tools) | -9.20

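Commonsense Reasoning

Gemini 3.5 Flash leads 1/1
Benchmark | Gemini 3.5 Flash (score; rank; mode) | Opus 4.7 (score; rank; mode) | Difference
Simple Bench | 76.70; 4 / 63; Normal (No Tools) | 61.70; 13 / 63; Normal (No Tools) | +15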

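Specification comparison

Field | Gemini 3.5 Flash | Opus 4.7
Publisher | Google DeepMind | Anthropic
Release date | 2026-06-20 | 2026-04-16
Model type | Multimodal large model | Reasoning large model
Architecture | Dense model | Dense model
Parameter count | No data | No data
Context length (tokens) | 1M | 1000K (= 1M)
Max output (tokens) | 64K | 128K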

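API pricing

Prices come from the API records configured on DataLearner where available; missing items are not estimated.

Price item | Gemini 3.5 Flash | Opus 4.7
Text input | $1.5 / 1M tokens | $5 / 1M tokens
Text output | $9 / 1M tokens | $25 / 1M tokens
Cache read | No public price | $0.5 / 1M tokens
Cache write | No public price | $6.25 / 1M tokens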
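
To make the per-token prices easier to compare, here is a minimal sketch that turns the listed text input and output prices into a per-request cost. The function name `request_cost` and the workload of 200K input tokens and 20K output tokens are hypothetical choices for illustration, not figures from this page. Cache pricing is left out because Gemini 3.5 Flash has no public cache price listed.

```python
# Listed text prices in USD per 1M tokens, copied from the pricing table.
PRICES_PER_MILLION = {
    "Gemini 3.5 Flash": {"input": 1.5, "output": 9.0},
    "Opus 4.7": {"input": 5.0, "output": 25.0},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed text prices (no caching)."""
    price = PRICES_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not data from the page):
# 200K input tokens and 20K output tokens per request.
for model in PRICES_PER_MILLION:
    print(f"{model}: ${request_cost(model, 200_000, 20_000):.2f}")
```

Under these assumed token counts the sketch gives about $0.48 for Gemini 3.5 Flash and $1.50 for Opus 4.7; actual costs depend on the real input/output mix and on any cache usage.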

Summary

  • Gemini 3.5 Flash leads in: AI Agent - Tool Usage (3/3), Commonsense Reasoning (1/1)
  • Opus 4.7 leads in: General Knowledge (3/3), Coding and Software Engineer (1/1)

Across the 8 shared benchmarks, Opus 4.7 scores 0.36 points higher on average.

Largest single-benchmark gap: Simple Bench, where Gemini 3.5 Flash scores 76.70 and Opus 4.7 scores 61.70 (difference +15).
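
The summary figures can be reproduced from the score tables with a short sketch like the one below. The scores are copied from the tables on this page; the function name `summarize` and the data layout are illustrative assumptions, not how the page itself is generated.

```python
from collections import Counter

# (category, benchmark, Gemini 3.5 Flash score, Opus 4.7 score), copied from the score tables.
SCORES = [
    ("AI Agent - Tool Usage", "TerminalBench 2.1", 76.20, 69.70),
    ("AI Agent - Tool Usage", "MCP-Atlas", 83.60, 79.10),
    ("AI Agent - Tool Usage", "OSWorld-Verified", 78.40, 78.00),
    ("General Knowledge", "HLE", 40.20, 54.70),
    ("General Knowledge", "ARC-AGI-2", 72.10, 75.80),
    ("General Knowledge", "LiveBench", 75.02, 76.91),
    ("Coding and Software Engineer", "SWE-Bench Pro - Public", 55.10, 64.30),
    ("Commonsense Reasoning", "Simple Bench", 76.70, 61.70),
]


def summarize(scores):
    """Compute per-benchmark differences (Gemini minus Opus) and summary figures."""
    # Round to 2 decimals to avoid floating-point noise in the subtraction.
    diffs = {name: round(gemini - opus, 2) for _, name, gemini, opus in scores}
    leaders = Counter(
        "Gemini 3.5 Flash" if d > 0 else "Opus 4.7" if d < 0 else "tie"
        for d in diffs.values()
    )
    mean_diff = round(sum(diffs.values()) / len(diffs), 2)
    # The largest gap is judged by absolute value, regardless of which model leads.
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return diffs, leaders, mean_diff, largest


diffs, leaders, mean_diff, largest = summarize(SCORES)
print(leaders["Gemini 3.5 Flash"], leaders["Opus 4.7"], mean_diff, largest)
```

A negative mean difference means Opus 4.7 is ahead on average, which matches the sign convention of the per-benchmark differences.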

The body of this page is generated from structured model, pricing, and benchmark data; it is not written by a live LLM.