Llama3.1-8B Benchmark Details

The benchmark results currently highlighted for Llama3.1-8B are BBH (score 57.70, rank 15 of 20), GSM8K (score 55.30, rank 21 of 26), and MBPP (score 53.90, rank 25 of 28).
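Because each benchmark has a different number of ranked entries, one possible way to compare the rank/total pairs is to divide rank by total. The following is a minimal Python sketch under that assumption, using only the three figures quoted above. The helper name `relative_rank` is hypothetical, and nothing here is claimed to be how the page itself orders its results.

```python
# Figures copied from the summary above: (score, rank, total).
results = {
    "BBH": (57.70, 15, 20),
    "GSM8K": (55.30, 21, 26),
    "MBPP": (53.90, 25, 28),
}


def relative_rank(rank: int, total: int) -> float:
    """Hypothetical helper: rank divided by total entries (lower is better)."""
    return rank / total


# Sort benchmark names from best to worst relative rank.
ordered = sorted(results, key=lambda name: relative_rank(*results[name][1:]))

for name in ordered:
    score, rank, total = results[name]
    ratio = relative_rank(rank, total)
    print(f"{name}: score {score:.2f}, rank {rank}/{total}, relative rank {ratio:.2f}")
```

A ratio close to 1.00 means the model sits near the bottom of that leaderboard, so a raw score is easier to interpret when read alongside this ratio.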

Benchmark Results

General Knowledge

4 evaluations
Benchmark / mode    Score    Rank/total
(name missing)      66.60    62 / 65
BBH                 57.70    15 / 20
(name missing)      35.40    122 / 126
(name missing)      25.80    174 / 179

Math and Reasoning

2 evaluations
Benchmark / mode    Score    Rank/total
GSM8K               55.30    21 / 26
(name missing)      20.50    40 / 42

Coding and Software Engineering

2 evaluations
Benchmark / mode    Score    Rank/total
MBPP                53.90    25 / 28
(name missing)      33.50    36 / 39

Commonsense Reasoning

1 evaluation
Benchmark / mode    Score    Rank/total
(name missing)      59.30    4 / 4