A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


DeepSeek V3.2 Benchmark Details

Below are DeepSeek V3.2's benchmark scores and model comparisons. In-depth analysis is being prepared.

Benchmark Results

DeepSeek V3.2

Thinking

Comprehensive Evaluation

2 evaluations

Benchmark     Mode     Score  Rank / total
GPQA Diamond  default  82.40  49 / 161
HLE           default  25.10  53 / 115

Coding and Software Engineering

5 evaluations

Benchmark               Mode     Score  Rank / total
CodeForces              default  2386   7 / 10
LiveCodeBench           default  83.30  12 / 104
SWE-bench Verified      default  73.10  43 / 92
SWE-bench Verified      default  70.20  43 / 92
SWE-Bench Pro - Public  default  40.90  12 / 17

Mathematical Reasoning

2 evaluations

Benchmark  Mode     Score  Rank / total
AIME 2025  default  93.10  31 / 108
AIME 2026  default  92.70  3 / 7

Agent Capability Evaluation

2 evaluations

Benchmark       Mode     Score  Rank / total
τ²-Bench        default  80.30  12 / 36
Aider-Polyglot  default  69.90  12 / 26

AI Agent - Information Gathering

1 evaluation

Benchmark   Mode     Score  Rank / total
BrowseComp  default  51.40  24 / 34

AI Agent - Tool Use

1 evaluation

Benchmark           Mode     Score  Rank / total
Terminal Bench 2.0  default  46.40  22 / 26
Comparison with Other Models