Llama3.3-70B-Instruct Benchmark Details

Llama3.3-70B-Instruct's best leaderboard placements are on MBPP (score 87.60, rank 3 of 28), MATH (score 77, rank 13 of 42), and HumanEval (score 88.40, rank 14 of 39).
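
The three results above are not listed in score order (HumanEval has the highest score but comes last). One plausible reading, which is an assumption here and not something the page states, is that they are ordered by relative rank, that is, rank divided by the number of ranked models. A minimal sketch of that arithmetic, using only the figures quoted above (the `relative_rank` helper is a hypothetical name introduced for illustration):

```python
# (benchmark, score, rank, total) as quoted in the summary above.
results = [
    ("HumanEval", 88.40, 14, 39),
    ("MBPP", 87.60, 3, 28),
    ("MATH", 77.0, 13, 42),
]


def relative_rank(rank: int, total: int) -> float:
    """Rank as a fraction of the leaderboard size (lower is better).

    Assumption: this is an illustrative metric, not one the page defines.
    """
    return rank / total


# Sort from best to worst relative placement.
ordered = sorted(results, key=lambda r: relative_rank(r[2], r[3]))

for name, score, rank, total in ordered:
    print(f"{name}: score {score:.2f}, rank {rank} / {total}, "
          f"relative rank {relative_rank(rank, total):.3f}")
```

Under this assumption the sorted order is MBPP, MATH, HumanEval, which matches the order in which the summary lists them.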

Benchmark Results

General Knowledge

3 evaluations

Benchmark / mode    Score    Rank / total
(not named)         86       33 / 65
(not named)         68.90    94 / 126
(not named)         50.50    152 / 179

Coding and Software Engineering

3 evaluations

Benchmark / mode    Score    Rank / total
HumanEval           88.40    14 / 39
MBPP                87.60    3 / 28
(not named)         33.30    109 / 120

Math and Reasoning

1 evaluation

Benchmark / mode    Score    Rank / total
MATH                77       13 / 42