DeepSeek-V4-Pro vs DeepSeek-V3.1
Across the 5 shared benchmarks, DeepSeek-V3.1 leads overall: DeepSeek-V4-Pro wins 2, DeepSeek-V3.1 wins 3, with 0 ties; on average, DeepSeek-V4-Pro scores 0.60 points lower.
DeepSeek-V4-Pro
DeepSeek-AI · 2026-04-24 · Reasoning model
DeepSeek-V3.1
DeepSeek-AI · 2025-08-20 · AI model
DeepSeek-V4-Pro: 2 wins (40%) · DeepSeek-V3.1: 3 wins (60%)
Benchmark scores
Grouped by capability and sorted by largest gap within each group. 5 shared benchmarks.
General Knowledge
DeepSeek-V3.1 wins 3/3.

| Benchmark | DeepSeek-V4-Pro | DeepSeek-V3.1 | Diff |
|---|---|---|---|
| HLE | 7.70 (rank 133/149, Normal, no tools) | 15.90 (rank 110/149, thinking) | -8.20 |
| GPQA Diamond | 72.90 (rank 99/175, Normal, no tools) | 74.90 (rank 92/175) | -2.00 |
| MMLU Pro | 82.90 (rank 44/124, Normal, no tools) | 83.70 (rank 39/124) | -0.80 |
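The per-benchmark diffs and the 3/3 win tally above can be recomputed directly from the listed scores. A minimal sketch (Python, using only the three General Knowledge scores shown here; higher is better on all three):

```python
# Scores copied from the General Knowledge table above.
scores = {
    # benchmark: (DeepSeek-V4-Pro, DeepSeek-V3.1)
    "HLE": (7.70, 15.90),
    "GPQA Diamond": (72.90, 74.90),
    "MMLU Pro": (82.90, 83.70),
}

# Diff column: V4-Pro minus V3.1, rounded to two decimals.
diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}

# Win tally for this group: benchmarks where V3.1's score is higher.
v31_wins = sum(1 for a, b in scores.values() if b > a)

# Largest single-benchmark gap (most negative diff).
largest_gap = min(diffs, key=diffs.get)

print(diffs)        # {'HLE': -8.2, 'GPQA Diamond': -2.0, 'MMLU Pro': -0.8}
print(v31_wins)     # 3
print(largest_gap)  # HLE
```

This reproduces the group's 3/3 result and confirms HLE as the widest gap, matching the summary further down.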
Specs
| Field | DeepSeek-V4-Pro | DeepSeek-V3.1 |
|---|---|---|
| Publisher | DeepSeek-AI | DeepSeek-AI |
| Release date | 2026-04-24 | 2025-08-20 |
| Model type | Reasoning model | AI model |
| Architecture | MoE | MoE |
| Parameters | 1,600B (1.6T) | 671B |
| Context length | 1M | 128K |
| Max output | 384K | 8K |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | DeepSeek-V4-Pro | DeepSeek-V3.1 |
|---|---|---|
| Text input | $1.74 / 1M tokens | $0.56 / 1M tokens |
| Text output | $3.48 / 1M tokens | $1.68 / 1M tokens |
| Cache read | $0.145 / 1M tokens | Not public |
| Cache write | $1.74 / 1M tokens | Not public |
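To compare what these rates mean per request, the per-1M-token prices can be applied to a token count. A minimal sketch (the token counts are made-up example values; DeepSeek-V3.1's cache rates are not public, so caching is omitted):

```python
# USD per 1M tokens, from the API pricing table above.
PRICES = {
    "DeepSeek-V4-Pro": {"input": 1.74, "output": 3.48},
    "DeepSeek-V3.1": {"input": 0.56, "output": 1.68},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request, ignoring cache discounts."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: 10K input tokens, 2K output tokens.
print(round(request_cost("DeepSeek-V4-Pro", 10_000, 2_000), 4))  # 0.0244
print(round(request_cost("DeepSeek-V3.1", 10_000, 2_000), 4))    # 0.009
```

At these example volumes DeepSeek-V3.1 is roughly 2.7x cheaper per request, driven mostly by its lower input-token rate.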
Summary
- DeepSeek-V4-Pro leads in: Coding and Software Engineering (2/2)
- DeepSeek-V3.1 leads in: General Knowledge (3/3)
On average across the 5 shared benchmarks, DeepSeek-V3.1 scores 0.60 higher.
Largest single-benchmark gap: HLE — DeepSeek-V4-Pro 7.70 vs DeepSeek-V3.1 15.90 (-8.20).
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.