Llama-3.2-3B Benchmark Details

Llama-3.2-3B's benchmark results are led by GSM8K (rank 23 of 26, score 34), BBH (rank 19 of 21, score 46.80), and GPQA Diamond (rank 179 of 187, score 26.60).

Benchmark Results

General Knowledge

4 evaluations

Benchmark / mode    Score    Rank/total
[name missing]      54.75    65 / 66
BBH                 46.80    19 / 21
GPQA Diamond        26.60    179 / 187
[name missing]      25       131 / 132

Math and Reasoning

2 evaluations

Benchmark / mode    Score    Rank/total
GSM8K               34       23 / 26
[name missing]      8.50     42 / 42

Coding and Software Engineering

2 evaluations

Benchmark / mode    Score    Rank/total
[name missing]      48.70    27 / 28
[name missing]      28       38 / 39
