DeepSeek-R1-0528 Benchmark Details

DeepSeek-R1-0528's strongest benchmark results are currently MATH-500 (score 98, rank 7 / 44), Creative Writing (score 86.25, rank 4 / 23), and MMLU Pro (score 85, rank 25 / 126).
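Because each benchmark ranks a different number of models, the "rank / total" pairs are easier to compare once normalized. The following is a minimal sketch, assuming that rank 1 is the best position and that the second number is the count of models evaluated on that benchmark (the page does not state this explicitly). The helper name `rank_fraction` is hypothetical and only used for illustration.

```python
from fractions import Fraction


def rank_fraction(rank_total: str) -> Fraction:
    """Convert a 'rank / total' string into rank divided by total.

    Assumption: rank 1 is best, so a smaller fraction means a
    stronger relative placement.
    """
    rank, total = (int(part) for part in rank_total.split("/"))
    if not 1 <= rank <= total:
        raise ValueError(f"rank must be between 1 and total: {rank_total!r}")
    return Fraction(rank, total)


# The three headline results quoted in the summary sentence above.
results = {
    "MATH-500": "7 / 44",
    "Creative Writing": "4 / 23",
    "MMLU Pro": "25 / 126",
}

# Sort benchmarks from best to worst relative placement.
ordered = sorted(results, key=lambda name: rank_fraction(results[name]))

for name in ordered:
    frac = rank_fraction(results[name])
    print(f"{name}: top {float(frac):.1%} ({results[name]})")
```

Under these assumptions the three headline benchmarks keep the order in which the summary lists them: MATH-500 first, then Creative Writing, then MMLU Pro.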

Benchmark Results

Thinking

General Knowledge

5 evaluations

Benchmark / mode        Score     Rank/total
MMLU Pro                85        25 / 126
(name not given)        81        70 / 179
(name not given)        21.20     54 / 65
(name not given)        17.70     113 / 159
(name not given)        1.30      52 / 59

Common Sense

1 evaluation

Benchmark / mode        Score     Rank/total
(name not given)        27.80     25 / 45

Coding and Software Engineering

2 evaluations

Benchmark / mode        Score     Rank/total
(name not given)        73.30     45 / 120
(name not given)        57.60     80 / 108

Math and Reasoning

5 evaluations

Benchmark / mode        Score     Rank/total
MATH-500                98        7 / 44
(name not given)        91.40     13 / 62
(name not given)        87.50     44 / 106

Writing and Creative Capabilities

1 evaluation

Benchmark / mode        Score     Rank/total
Creative Writing        86.25     4 / 23

AI Agent - Tool Usage

1 evaluation

Benchmark / mode        Score     Rank/total
(name not given)        5.70      35 / 35

Common Sense Reasoning

1 evaluation

Benchmark / mode                      Score     Rank/total
Simple Bench (Thinking Enabled)       40.80     38 / 63

Agent Level Benchmark

1 evaluation

Benchmark / mode                      Score     Rank/total
Aider-Polyglot (Thinking Enabled)     71.40     15 / 59