DeepSeek V3.2 vs DeepSeek-V3-0324

Across 8 shared benchmarks, DeepSeek V3.2 leads overall: DeepSeek V3.2 wins 8, DeepSeek-V3-0324 wins 0, with 0 ties and an average score difference of +31.50.

DeepSeek V3.2

DeepSeek-AI · 2025-12-01 · Reasoning model

DeepSeek-V3-0324

DeepSeek-AI · 2025-03-24 · Chat model

DeepSeek V3.2: 8 wins (100%) · DeepSeek-V3-0324: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 8 shared benchmarks.

General Knowledge

DeepSeek V3.2 leads 3/3

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Diff
ARC-AGI	57 (rank 38 / 65; Thinking (No Tools))	9 (rank 59 / 65)	+48
HLE	25.10 (rank 89 / 159; Thinking (No Tools))	5.20 (rank 152 / 159)	+19.90
GPQA Diamond	82.40 (rank 65 / 179; Thinking (No Tools))	68.40 (rank 120 / 179)	+14

Agent Level Benchmark

DeepSeek V3.2 leads 2/2

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Diff
τ²-Bench	80.30 (rank 14 / 40)	38.80 (rank 36 / 40)	+41.50
Aider-Polyglot	69.90 (rank 12 / 26)	55.10 (rank 21 / 26)	+14.80

Coding and Software Engineer

DeepSeek V3.2 leads 2/2

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Diff
SWE-bench Verified	73.10 (rank 45 / 108)	38.80 (rank 99 / 108)	+34.30
LiveCodeBench	83.30 (rank 21 / 120; Thinking (No Tools))	49.20 (rank 93 / 120)	+34.10

Math and Reasoning

DeepSeek V3.2 leads 1/1

Benchmark	DeepSeek V3.2	DeepSeek-V3-0324	Diff
AIME2025	93.10 (rank 30 / 106; Thinking (No Tools))	47.70 (rank 88 / 106)	+45.40

Specs

Field	DeepSeek V3.2	DeepSeek-V3-0324
Publisher	DeepSeek-AI	DeepSeek-AI
Release date	2025-12-01	2025-03-24
Model type	Reasoning model	Chat model
Architecture	MoE	MoE
Parameters	671B	671B
Context length	128K	128K
Max output	8K	Not available

Summary

DeepSeek V3.2 leads in: General Knowledge (3/3), Agent Level Benchmark (2/2), Coding and Software Engineer (2/2), Math and Reasoning (1/1)

On average across the 8 shared benchmarks, DeepSeek V3.2 scores 31.50 higher.

Largest single-benchmark gap: ARC-AGI — DeepSeek V3.2 57 vs DeepSeek-V3-0324 9 (+48).
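
The summary figures follow directly from the per-benchmark scores. As a minimal sketch, the Python below recomputes the win count, the average difference and the largest gap from the scores listed in the tables. The names `scores` and `summarize` are illustrative assumptions, not anything the page itself defines.

```python
# Scores copied from the benchmark tables, as
# (DeepSeek V3.2, DeepSeek-V3-0324) pairs.
scores = {
    "ARC-AGI": (57.0, 9.0),
    "HLE": (25.10, 5.20),
    "GPQA Diamond": (82.40, 68.40),
    "τ²-Bench": (80.30, 38.80),
    "Aider-Polyglot": (69.90, 55.10),
    "SWE-bench Verified": (73.10, 38.80),
    "LiveCodeBench": (83.30, 49.20),
    "AIME2025": (93.10, 47.70),
}


def summarize(scores):
    """Recompute wins, ties, average difference and largest gap."""
    # Round each difference to two decimals to avoid float noise.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_v32": sum(d > 0 for d in diffs.values()),
        "wins_v3_0324": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        "avg_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": (largest, diffs[largest]),
    }


summary = summarize(scores)
print(summary)
```

Under these assumptions the sketch reproduces the 8-0 win count, the +31.50 average difference and ARC-AGI as the largest gap (+48).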

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.
