GLM-4-9B-Chat Benchmark Details

GLM-4-9B-Chat's benchmark results are currently led by GPQA (rank 7 of 15, score 58.50), AIME 2024 (rank 34 of 62, score 76.40), and MMLU Pro (rank 90 of 132, score 72.40).
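The order of the three results in the summary is consistent with sorting by relative rank (rank divided by the number of ranked models) rather than by raw score. The sketch below reproduces that ordering from the figures quoted in the summary. The `rank_fraction` helper and the sorting rule are assumptions made for illustration; the page does not state how it orders results.

```python
# Figures copied from the summary: (benchmark, rank, total models, score).
results = [
    ("MMLU Pro", 90, 132, 72.40),
    ("GPQA", 7, 15, 58.50),
    ("AIME 2024", 34, 62, 76.40),
]


def rank_fraction(rank: int, total: int) -> float:
    """Hypothetical helper: position in the field, where lower is better."""
    return rank / total


# Assumed ordering rule: best relative placement first.
ordered = sorted(results, key=lambda r: rank_fraction(r[1], r[2]))

for name, rank, total, score in ordered:
    fraction = rank_fraction(rank, total)
    print(f"{name}: rank {rank}/{total} (fraction {fraction:.3f}), score {score:.2f}")
```

Raw scores are not comparable across benchmarks, so a relative rank is one plausible way to line the results up. Sorting by score instead would put AIME 2024 first.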

Benchmark Results

General Knowledge

2 evaluations

Benchmark / mode    Score    Rank/total
MMLU Pro            72.40    90 / 132
GPQA                58.50    7 / 15

Math and Reasoning

1 evaluation

Benchmark / mode    Score    Rank/total
AIME 2024           76.40    34 / 62

Coding and Software Engineering

1 evaluation

Benchmark / mode    Score    Rank/total
(unnamed)           51.80    90 / 123
