DeepSeek V3.2 vs DeepSeek-V3-0324

Across 8 shared benchmarks, DeepSeek V3.2 leads overall: DeepSeek V3.2 wins 8, DeepSeek-V3-0324 wins 0, with 0 ties and an average score difference of +31.50.

DeepSeek V3.2

DeepSeek-AI · 2025-12-01 · Reasoning model

DeepSeek-V3-0324

DeepSeek-AI · 2025-03-24 · Chat model

DeepSeek V3.2: 8 wins (100%) vs DeepSeek-V3-0324: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 8 shared benchmarks.

General Knowledge

DeepSeek V3.2 3/3
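| Benchmark | DeepSeek V3.2 | Rank | Mode | DeepSeek-V3-0324 | Rank | Diff |
|---|---|---|---|---|---|---|
| ARC-AGI | 57 | 38 / 65 | Thinking (No Tools) | 9 | 59 / 65 | +48 |
| HLE | 25.10 | 87 / 157 | Thinking (No Tools) | 5.20 | 150 / 157 | +19.90 |
| GPQA Diamond | 82.40 | 64 / 178 | Thinking (No Tools) | 68.40 | 119 / 178 | +14 |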

Agent Level Benchmark

DeepSeek V3.2 2/2
| Benchmark | DeepSeek V3.2 | Rank | DeepSeek-V3-0324 | Rank | Diff |
|---|---|---|---|---|---|
| τ²-Bench | 80.30 | 14 / 40 | 38.80 | 36 / 40 | +41.50 |
| Aider-Polyglot | 69.90 | 12 / 26 | 55.10 | 21 / 26 | +14.80 |

Coding and Software Engineer

DeepSeek V3.2 2/2
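| Benchmark | DeepSeek V3.2 | Rank | Mode | DeepSeek-V3-0324 | Rank | Diff |
|---|---|---|---|---|---|---|
| SWE-bench Verified | 73.10 | 45 / 108 |  | 38.80 | 99 / 108 | +34.30 |
| LiveCodeBench | 83.30 | 21 / 120 | Thinking (No Tools) | 49.20 | 93 / 120 | +34.10 |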

Math and Reasoning

DeepSeek V3.2 1/1
| Benchmark | DeepSeek V3.2 | Rank | Mode | DeepSeek-V3-0324 | Rank | Diff |
|---|---|---|---|---|---|---|
| AIME2025 | 93.10 | 30 / 106 | Thinking (No Tools) | 47.70 | 88 / 106 | +45.40 |
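The ordering described under "Benchmark scores" (grouped by capability, largest gap first within each group) can be sketched as follows. This is a minimal illustration, not the page's actual generator: the `rows` list and the `group_and_sort` function are hypothetical names, the diffs are copied by hand from the Diff column, and sorting by absolute gap is an assumption (every gap on this page is positive, so signed and absolute ordering agree).

```python
from collections import defaultdict

# (capability, benchmark, diff) rows copied from the Diff column on this page.
# They are listed out of order on purpose so the sort has work to do.
rows = [
    ("General Knowledge", "GPQA Diamond", 14.00),
    ("Coding and Software Engineer", "LiveCodeBench", 34.10),
    ("General Knowledge", "ARC-AGI", 48.00),
    ("Agent Level Benchmark", "Aider-Polyglot", 14.80),
    ("Math and Reasoning", "AIME2025", 45.40),
    ("General Knowledge", "HLE", 19.90),
    ("Coding and Software Engineer", "SWE-bench Verified", 34.30),
    ("Agent Level Benchmark", "τ²-Bench", 41.50),
]


def group_and_sort(rows):
    """Group rows by capability, then order each group by descending absolute gap."""
    groups = defaultdict(list)
    for capability, benchmark, diff in rows:
        groups[capability].append((benchmark, diff))
    return {
        capability: sorted(items, key=lambda item: abs(item[1]), reverse=True)
        for capability, items in groups.items()
    }


ordered = group_and_sort(rows)
for capability, items in ordered.items():
    print(capability)
    for benchmark, diff in items:
        print(f"  {benchmark}: {diff:+.2f}")
```

Within each group, the resulting order matches the row order of the tables above.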

Specs

| Field | DeepSeek V3.2 | DeepSeek-V3-0324 |
|---|---|---|
| Publisher | DeepSeek-AI | DeepSeek-AI |
| Release date | 2025-12-01 | 2025-03-24 |
| Model type | Reasoning model | Chat model |
| Architecture | MoE | MoE |
| Parameters | 671B | 671B |
| Context length | 128K | 128K |
| Max output | 8K | Not available |

Summary

  • DeepSeek V3.2 leads in: General Knowledge (3/3), Agent Level Benchmark (2/2), Coding and Software Engineer (2/2), Math and Reasoning (1/1)

On average across the 8 shared benchmarks, DeepSeek V3.2 scores 31.50 points higher than DeepSeek-V3-0324.

Largest single-benchmark gap: ARC-AGI, where DeepSeek V3.2 scores 57 and DeepSeek-V3-0324 scores 9 (+48).
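The summary figures (win counts, average difference, largest gap) can be reproduced from the per-benchmark scores. The snippet below is a minimal sketch, assuming the scores are transcribed by hand from the tables on this page; `scores` and `summarize` are hypothetical names chosen for illustration, and rounding to two decimals is an assumption made to match the page's formatting.

```python
# benchmark -> (DeepSeek V3.2 score, DeepSeek-V3-0324 score), copied from this page.
scores = {
    "ARC-AGI": (57.0, 9.0),
    "HLE": (25.10, 5.20),
    "GPQA Diamond": (82.40, 68.40),
    "τ²-Bench": (80.30, 38.80),
    "Aider-Polyglot": (69.90, 55.10),
    "SWE-bench Verified": (73.10, 38.80),
    "LiveCodeBench": (83.30, 49.20),
    "AIME2025": (93.10, 47.70),
}


def summarize(scores):
    """Return win/tie counts, the mean score difference, and the largest gap."""
    # Round each difference to two decimals to avoid floating-point noise.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_v32": sum(1 for d in diffs.values() if d > 0),
        "wins_v3_0324": sum(1 for d in diffs.values() if d < 0),
        "ties": sum(1 for d in diffs.values() if d == 0),
        "average_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": (largest, diffs[largest]),
    }


summary = summarize(scores)
print(summary)
```

The eight differences sum to 252.0, so the mean is 252.0 / 8 = 31.50, in agreement with the figure quoted above.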

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.