Claude 3.5 Haiku Benchmark Details

Claude 3.5 Haiku's strongest benchmark placements are currently MBPP (rank 6 / 28, score 85.60), HumanEval (rank 17 / 39, score 88.10), and MATH (rank 23 / 42, score 69.20).

Benchmark Results

General Knowledge

4 evaluations

Benchmark / mode    Score    Rank/total
MMLU                77.60    54 / 65
MMLU Pro            -        101 / 126
GPQA Diamond        41.60    163 / 179
GPQA                37.50    12 / 14

Coding and Software Engineering

2 evaluations

Benchmark / mode    Score    Rank/total
HumanEval           88.10    17 / 39
MBPP                85.60    6 / 28

Math and Reasoning

2 evaluations

Benchmark / mode    Score    Rank/total
MATH                69.20    23 / 42
FrontierMath        0.30     57 / 60

Common Sense

1 evaluation

Benchmark / mode    Score    Rank/total
SimpleQA            8.02     42 / 45

Agent Level Benchmark

1 evaluation

Benchmark / mode                  Score    Rank/total
Aider-Polyglot (Standard Mode)    -        45 / 59
