© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


GLM-4.5-Air Benchmark Details

GLM-4.5-Air's highest-scoring benchmark results are currently MATH-500 (rank 5 of 44, score 98.10), AIME 2024 (rank 15 of 62, score 89.40), and Pinch Bench (rank 13 of 37, score 85.70).

Benchmark Results


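General evaluation

4 evaluations

| Benchmark    | Mode             | Score | Rank / total |
|--------------|------------------|-------|--------------|
| MMLU Pro     | Thinking Enabled | 81.40 | 51 / 126     |
| GPQA Diamond | Thinking Enabled | 75    | 93 / 177     |
| LiveBench    | Standard Mode    | 60.53 | 41 / 52      |
| HLE          | Thinking Enabled | 10.60 | 129 / 156    |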

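Coding and software engineering

2 evaluations

| Benchmark          | Mode             | Score | Rank / total |
|--------------------|------------------|-------|--------------|
| LiveCodeBench      | Thinking Enabled | 70.70 | 49 / 120     |
| SWE-bench Verified | Thinking Enabled | 57.60 | 78 / 106     |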

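Mathematical reasoning

2 evaluations

| Benchmark | Mode             | Score | Rank / total |
|-----------|------------------|-------|--------------|
| MATH-500  | Thinking Enabled | 98.10 | 5 / 44       |
| AIME 2024 | Thinking Enabled | 89.40 | 15 / 62      |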

AI Agent - Tool usage

1 evaluation

| Benchmark      | Mode             | Score | Rank / total |
|----------------|------------------|-------|--------------|
| Terminal-Bench | Thinking Enabled | 30    | 22 / 35      |

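OpenClaw comprehensive agent capability evaluation

1 evaluation

| Benchmark   | Mode                    | Score | Rank / total |
|-------------|-------------------------|-------|--------------|
| Pinch Bench | Thinking Enabled, Tools | 85.70 | 13 / 37      |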
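
Raw ranks are hard to compare across benchmarks because each leaderboard has a different number of entries. One possible way to put them on a common scale is to divide rank by total, as in the minimal Python sketch below. The rows are copied from the results listed above; the names `RESULTS`, `rank_fraction`, and `by_relative_rank` are illustrative and are not part of any DataLearner tool or API, and the site itself may rank or summarize results differently.

```python
# Rows copied from the results above: (benchmark, score, rank, total).
RESULTS = [
    ("MMLU Pro", 81.40, 51, 126),
    ("GPQA Diamond", 75.0, 93, 177),
    ("LiveBench", 60.53, 41, 52),
    ("HLE", 10.60, 129, 156),
    ("LiveCodeBench", 70.70, 49, 120),
    ("SWE-bench Verified", 57.60, 78, 106),
    ("MATH-500", 98.10, 5, 44),
    ("AIME 2024", 89.40, 15, 62),
    ("Terminal-Bench", 30.0, 22, 35),
    ("Pinch Bench", 85.70, 13, 37),
]


def rank_fraction(rank, total):
    """Return rank / total; smaller means a relatively better placement."""
    return rank / total


def by_relative_rank(results):
    """Sort rows from the best relative placement to the worst."""
    return sorted(results, key=lambda row: rank_fraction(row[2], row[3]))


# Print one line per benchmark, best relative placement first.
for name, score, rank, total in by_relative_rank(RESULTS):
    share = rank_fraction(rank, total)
    print(f"{name:<20} score={score:>6.2f}  rank={rank:>3}/{total:<3}  {share:.1%}")
```

Under this normalization the three best placements are MATH-500, AIME 2024, and Pinch Bench, which matches the summary at the top of the page. The ratio ignores differences in which models appear on each leaderboard, so it is a rough comparison aid rather than a calibrated metric.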