GLM-4-9B-Chat Benchmark Details

GLM-4-9B-Chat's leading benchmark results are GPQA (rank 6 of 14, score 58.50), AIME 2024 (rank 34 of 62, score 76.40), and MMLU Pro (rank 87 of 126, score 72.40).

Benchmark Results

Thinking

General Knowledge

2 evaluations

Benchmark / mode    Score    Rank/total
MMLU Pro            72.40    87 / 126
GPQA                58.50    6 / 14

Math and Reasoning

1 evaluation

Benchmark / mode    Score    Rank/total
AIME 2024           76.40    34 / 62

Coding and Software Engineering

1 evaluation

Benchmark / mode    Score    Rank/total
(not listed)        51.80    88 / 120