Mistral-Small-3.1-24B-Instruct-2503 Benchmark Details

Mistral-Small-3.1-24B-Instruct-2503's strongest leaderboard placements are currently on HumanEval (score 88.41, rank 13 / 39), MATH (score 69.30, rank 21 / 42), and MBPP (score 74.71, rank 15 / 28).

Benchmark Results


General Knowledge

4 evaluations

Benchmark / mode    Score    Rank/total
MMLU                80.62    48 / 65
MMLU Pro            66.76    98 / 126
GPQA Diamond        45.96    160 / 179
GPQA                44.42    8 / 14

Coding and Software Engineering

2 evaluations

Benchmark / mode    Score    Rank/total
HumanEval           88.41    13 / 39
MBPP                74.71    15 / 28

Math and Reasoning

1 evaluation

Benchmark / mode    Score    Rank/total
MATH                69.30    21 / 42

Common Sense

1 evaluation

Benchmark / mode    Score    Rank/total
SimpleQA            10.43    39 / 45
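
Because each leaderboard has a different number of entries, raw ranks are hard to compare across benchmarks. The sketch below divides rank by total to get a relative position (lower is better) and sorts on it. The page does not say how its summary picks the leading benchmarks, so this ordering rule is an assumption, and the names `relative_rank` and `top_benchmarks` are hypothetical helpers rather than anything the site exposes. The numbers are copied from the results listed here.

```python
# (benchmark, score, rank, total) copied from the results on this page.
RESULTS = [
    ("MMLU", 80.62, 48, 65),
    ("MMLU Pro", 66.76, 98, 126),
    ("GPQA Diamond", 45.96, 160, 179),
    ("GPQA", 44.42, 8, 14),
    ("HumanEval", 88.41, 13, 39),
    ("MBPP", 74.71, 15, 28),
    ("MATH", 69.30, 21, 42),
    ("SimpleQA", 10.43, 39, 45),
]


def relative_rank(rank, total):
    """Return rank as a fraction of leaderboard size (lower is better)."""
    return rank / total


def top_benchmarks(results, n=3):
    """Return the n benchmark names with the best relative rank.

    Assumption: "best" means the smallest rank / total ratio.
    """
    ordered = sorted(results, key=lambda r: relative_rank(r[2], r[3]))
    return [name for name, _score, _rank, _total in ordered[:n]]


if __name__ == "__main__":
    # Print every benchmark from best to worst relative placement.
    for name, score, rank, total in sorted(
        RESULTS, key=lambda r: relative_rank(r[2], r[3])
    ):
        ratio = relative_rank(rank, total)
        print(f"{name:<13} {score:>6.2f}  {rank:>3} / {total:<3}  {ratio:.3f}")
```

Under this assumption the three best placements are HumanEval, MATH, and MBPP, in that order. Sorting by raw score instead would put MMLU (80.62) ahead of MBPP and MATH, so the choice of ordering rule matters.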
