DataLearner AI

A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


DeepSeek-V4-Flash Benchmark Details

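DeepSeek-V4-Flash's strongest listed results are on LiveCodeBench (rank 4 / 118, score 91.60), MMLU Pro (rank 13 / 124, score 86.40), and IMO-AnswerBench (rank 2 / 17, score 88.40). The LiveCodeBench and IMO-AnswerBench figures differ from the High-mode rows listed below (88.40 at 7 / 118 and 85.10 at 7 / 17), so they presumably come from a different thinking mode.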

Benchmark Results


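Comprehensive evaluation

4 evaluations

| Benchmark    | Mode                  | Score | Rank / total |
|--------------|-----------------------|-------|--------------|
| GPQA Diamond | High                  | 87.40 | 32 / 175     |
| MMLU Pro     | High                  | 86.40 | 13 / 124     |
| HLE          | High                  | 29.40 | 69 / 148     |
| HLE          | High, tools, internet | 40.30 | 43 / 148     |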

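Coding and software engineering

5 evaluations

| Benchmark              | Mode        | Score | Rank / total |
|------------------------|-------------|-------|--------------|
| CodeForces             | High        | 2816  | 5 / 16       |
| LiveCodeBench          | High        | 88.40 | 7 / 118      |
| SWE-bench Verified     | High, tools | 78.60 | 17 / 103     |
| SWE-bench Multilingual | High, tools | 70.20 | 11 / 17      |
| SWE-Bench Pro - Public | High, tools | 52.30 | 21 / 36      |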

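AI Agent - Information gathering

1 evaluation

| Benchmark  | Mode                  | Score | Rank / total |
|------------|-----------------------|-------|--------------|
| BrowseComp | High, tools, internet | 53.50 | 31 / 43      |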

AI Agent - Tool use

1 evaluation

| Benchmark          | Mode        | Score | Rank / total |
|--------------------|-------------|-------|--------------|
| Terminal Bench 2.0 | High, tools | 56.60 | 24 / 43      |

Mathematical reasoning

1 evaluation

| Benchmark       | Mode | Score | Rank / total |
|-----------------|------|-------|--------------|
| IMO-AnswerBench | High | 85.10 | 7 / 17       |
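
Because each benchmark ranks a different number of models, the raw "rank / total" values are hard to compare directly (7 / 118 is a much stronger placement than 7 / 17). The sketch below shows one way to normalize them into a "top N%" figure and sort the rows by relative placement. The rows are copied from the tables above; the helper names `top_percent` and `sort_by_relative_rank` are hypothetical and not part of any DataLearner tooling, and the normalization itself is an assumption of this example, not a metric the site publishes.

```python
# Rows copied from the benchmark tables: (benchmark, mode, rank, total).
ROWS = [
    ("GPQA Diamond", "High", 32, 175),
    ("MMLU Pro", "High", 13, 124),
    ("HLE", "High", 69, 148),
    ("HLE", "High, tools, internet", 43, 148),
    ("CodeForces", "High", 5, 16),
    ("LiveCodeBench", "High", 7, 118),
    ("SWE-bench Verified", "High, tools", 17, 103),
    ("SWE-bench Multilingual", "High, tools", 11, 17),
    ("SWE-Bench Pro - Public", "High, tools", 21, 36),
    ("BrowseComp", "High, tools, internet", 31, 43),
    ("Terminal Bench 2.0", "High, tools", 24, 43),
    ("IMO-AnswerBench", "High", 7, 17),
]


def top_percent(rank: int, total: int) -> float:
    """Return the placement as a percentage of the field (lower is better)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * rank / total


def sort_by_relative_rank(rows):
    """Order rows from best to worst relative placement."""
    return sorted(rows, key=lambda row: row[2] / row[3])


if __name__ == "__main__":
    for name, mode, rank, total in sort_by_relative_rank(ROWS):
        print(f"{name:<24} {mode:<22} top {top_percent(rank, total):5.1f}%")
```

Under this normalization, placements on small leaderboards (16 or 17 models) move in coarse steps, so they should be read with more caution than placements on leaderboards with over 100 entries.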