DeepSeek V3.2 vs DeepSeek-V3.1
Across 6 shared benchmarks, DeepSeek V3.2 leads overall: DeepSeek V3.2 wins 5, DeepSeek-V3.1 wins 1, with 0 ties and an average score difference of +4.23.
DeepSeek V3.2
DeepSeek-AI · 2025-12-01 · Reasoning model
DeepSeek-V3.1
DeepSeek-AI · 2025-08-20 · Chat model
DeepSeek V3.2: 5 wins (83%) · DeepSeek-V3.1: 1 win (17%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 6 shared benchmarks.
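The grouping and ordering described above can be sketched in a few lines of Python. This is only an illustration, not the site's actual pipeline: the names `ROWS` and `group_by_capability` are hypothetical, the scores are copied from the tables on this page, and "largest gap" is assumed to mean the largest absolute difference.

```python
from collections import defaultdict

# (capability, benchmark, DeepSeek V3.2 score, DeepSeek-V3.1 score),
# copied from the tables on this page.
ROWS = [
    ("Coding and Software Engineer", "LiveCodeBench", 83.30, 74.80),
    ("Coding and Software Engineer", "SWE-bench Verified", 73.10, 66.00),
    ("General Knowledge", "HLE", 25.10, 15.90),
    ("General Knowledge", "GPQA Diamond", 82.40, 80.10),
    ("Agent Level Benchmark", "Aider-Polyglot", 69.90, 76.30),
    ("Math and Reasoning", "AIME2025", 93.10, 88.40),
]


def group_by_capability(rows):
    """Group benchmarks by capability and sort each group by absolute gap."""
    groups = defaultdict(list)
    for capability, benchmark, v32, v31 in rows:
        # Diff is V3.2 minus V3.1, rounded to the two decimals shown in the tables.
        groups[capability].append((benchmark, round(v32 - v31, 2)))
    for entries in groups.values():
        # Assumption: "largest gap" means largest absolute difference.
        entries.sort(key=lambda entry: abs(entry[1]), reverse=True)
    return dict(groups)


grouped = group_by_capability(ROWS)
for capability, entries in grouped.items():
    print(capability, entries)
```

With only one or two benchmarks per group the sort is trivial here, but the same function would apply unchanged to a longer list of shared benchmarks.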
Coding and Software Engineer
DeepSeek V3.2: 2/2

| Benchmark | DeepSeek V3.2 | DeepSeek-V3.1 | Diff |
|---|---|---|---|
| LiveCodeBench | 83.30 · rank 21 / 120 · Thinking (No Tools) | 74.80 · rank 40 / 120 | +8.50 |
| SWE-bench Verified | 73.10 · rank 45 / 108 | 66.00 · rank 70 / 108 | +7.10 |
General Knowledge
DeepSeek V3.2: 2/2

| Benchmark | DeepSeek V3.2 | DeepSeek-V3.1 | Diff |
|---|---|---|---|
| HLE | 25.10 · rank 87 / 157 · Thinking (No Tools) | 15.90 · rank 118 / 157 | +9.20 |
| GPQA Diamond | 82.40 · rank 64 / 178 · Thinking (No Tools) | 80.10 · rank 75 / 178 | +2.30 |
Agent Level Benchmark
DeepSeek-V3.1: 1/1

| Benchmark | DeepSeek V3.2 | DeepSeek-V3.1 | Diff |
|---|---|---|---|
| Aider-Polyglot | 69.90 · rank 12 / 26 | 76.30 · rank 5 / 26 | -6.40 |
Math and Reasoning
DeepSeek V3.2: 1/1

| Benchmark | DeepSeek V3.2 | DeepSeek-V3.1 | Diff |
|---|---|---|---|
| AIME2025 | 93.10 · rank 30 / 106 · Thinking (No Tools) | 88.40 · rank 42 / 106 | +4.70 |
Specs
| Field | DeepSeek V3.2 | DeepSeek-V3.1 |
|---|---|---|
| Publisher | DeepSeek-AI | DeepSeek-AI |
| Release date | 2025-12-01 | 2025-08-20 |
| Model type | Reasoning model | Chat model |
| Architecture | MoE | MoE |
| Parameters | 671B | 671B |
| Context length | 128K | 128K |
| Max output | 8K | 8K |
Summary
- DeepSeek V3.2 leads in: Coding and Software Engineer (2/2), General Knowledge (2/2), Math and Reasoning (1/1)
- DeepSeek-V3.1 leads in: Agent Level Benchmark (1/1)
On average across the 6 shared benchmarks, DeepSeek V3.2 scores 4.23 points higher.
Largest single-benchmark gap: HLE, where DeepSeek V3.2 scores 25.10 vs 15.90 for DeepSeek-V3.1 (+9.20).
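As a quick check on the arithmetic, the win counts, average difference, and largest gap can be recomputed from the six score pairs. The sketch below is illustrative only: `SCORES` and `summarize` are hypothetical names, the scores are copied from the tables on this page, and the average is assumed to be a plain unweighted mean of the per-benchmark differences.

```python
# (benchmark, DeepSeek V3.2 score, DeepSeek-V3.1 score), copied from this page.
SCORES = [
    ("LiveCodeBench", 83.30, 74.80),
    ("SWE-bench Verified", 73.10, 66.00),
    ("HLE", 25.10, 15.90),
    ("GPQA Diamond", 82.40, 80.10),
    ("Aider-Polyglot", 69.90, 76.30),
    ("AIME2025", 93.10, 88.40),
]


def summarize(scores):
    """Recompute the headline statistics from per-benchmark score pairs."""
    # Diff is V3.2 minus V3.1, rounded to the two decimals the page reports.
    diffs = {name: round(v32 - v31, 2) for name, v32, v31 in scores}
    total = len(diffs)
    wins_v32 = sum(d > 0 for d in diffs.values())
    wins_v31 = sum(d < 0 for d in diffs.values())
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_v32": wins_v32,
        "wins_v31": wins_v31,
        "ties": total - wins_v32 - wins_v31,
        "win_share_v32": round(100 * wins_v32 / total),
        "win_share_v31": round(100 * wins_v31 / total),
        # Assumption: unweighted mean over all shared benchmarks.
        "mean_diff": round(sum(diffs.values()) / total, 2),
        "largest_gap": (largest, diffs[largest]),
    }


summary = summarize(SCORES)
print(summary)
```

Under these assumptions the result matches the figures reported on this page: 5 wins to 1 with no ties (83% vs 17%), a mean difference of 4.23, and HLE as the largest gap at 9.2.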
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.