DeepSeek V3.2 vs DeepSeek-V3-0324

Across the 8 benchmarks both models share, DeepSeek V3.2 leads overall: DeepSeek V3.2 leads on 8, DeepSeek-V3-0324 leads on 0, and 0 are tied, with an average score gap of +31.50.

DeepSeek V3.2: DeepSeek-AI · 2025-12-01 · reasoning LLM

DeepSeek-V3-0324: DeepSeek-AI · 2025-03-24 · chat LLM

Benchmarks led: DeepSeek V3.2 8 (100%) · DeepSeek-V3-0324 0 (0%)

Benchmark scores

Grouped by capability category and sorted by score gap within each group; 8 benchmarks in total.

General Knowledge: DeepSeek V3.2 leads 3/3

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Gap
ARC-AGI	57 (rank 38 / 65; Thinking, No Tools)	9 (rank 59 / 65)	+48
HLE	25.10 (rank 87 / 157; Thinking, No Tools)	5.20 (rank 150 / 157)	+19.90
GPQA Diamond	82.40 (rank 64 / 178; Thinking, No Tools)	68.40 (rank 119 / 178)	+14

Agent Level Benchmark: DeepSeek V3.2 leads 2/2

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Gap
τ²-Bench	80.30 (rank 14 / 40)	38.80 (rank 36 / 40)	+41.50
Aider-Polyglot	69.90 (rank 12 / 26)	55.10 (rank 21 / 26)	+14.80

Coding and Software Engineer: DeepSeek V3.2 leads 2/2

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Gap
SWE-bench Verified	73.10 (rank 45 / 108)	38.80 (rank 99 / 108)	+34.30
LiveCodeBench	83.30 (rank 21 / 120; Thinking, No Tools)	49.20 (rank 93 / 120)	+34.10

Math and Reasoning: DeepSeek V3.2 leads 1/1

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Gap
AIME2025	93.10 (rank 30 / 106; Thinking, No Tools)	47.70 (rank 88 / 106)	+45.40

DeepSeek V3.2 leads in the following categories: General Knowledge (3/3), Agent Level Benchmark (2/2), Coding and Software Engineer (2/2), Math and Reasoning (1/1).

Across the 8 shared benchmarks, DeepSeek V3.2 scores 31.50 points higher on average.

Benchmark with the largest single gap: ARC-AGI, where DeepSeek V3.2 scores 57 and DeepSeek-V3-0324 scores 9 (gap +48).
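
The summary figures can be re-derived from the per-benchmark scores. The following is a minimal Python sketch, assuming the scores as transcribed from the tables on this page; the names `SCORES` and `summarize` are illustrative and are not part of the site's own pipeline, whose actual method is not documented here.

```python
from statistics import mean

# Scores transcribed from this page: (benchmark, DeepSeek V3.2, DeepSeek-V3-0324).
SCORES = [
    ("ARC-AGI", 57.0, 9.0),
    ("HLE", 25.10, 5.20),
    ("GPQA Diamond", 82.40, 68.40),
    ("τ²-Bench", 80.30, 38.80),
    ("Aider-Polyglot", 69.90, 55.10),
    ("SWE-bench Verified", 73.10, 38.80),
    ("LiveCodeBench", 83.30, 49.20),
    ("AIME2025", 93.10, 47.70),
]


def summarize(scores):
    """Return (wins_a, wins_b, ties, mean_gap, largest_gap) for model A minus model B."""
    # Round each gap to two decimals to avoid floating-point noise such as 19.900000000000002.
    gaps = [(name, round(a - b, 2)) for name, a, b in scores]
    wins_a = sum(1 for _, gap in gaps if gap > 0)
    wins_b = sum(1 for _, gap in gaps if gap < 0)
    ties = sum(1 for _, gap in gaps if gap == 0)
    mean_gap = round(mean(gap for _, gap in gaps), 2)
    # The largest gap is taken by absolute value, so it works whichever model leads.
    largest_gap = max(gaps, key=lambda item: abs(item[1]))
    return wins_a, wins_b, ties, mean_gap, largest_gap


if __name__ == "__main__":
    print(summarize(SCORES))
```

Under these assumptions the sketch reproduces the page's tallies: 8 wins, 0 losses, 0 ties, a mean gap of 31.5, and ARC-AGI as the largest gap at 48.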

The text on this page is generated from structured model, pricing, and benchmark data; it is not written by an LLM in real time.