GLM-5 vs GLM-4.7

Across 9 shared benchmarks, GLM-5 leads overall: GLM-5 wins 7, GLM-4.7 wins 1, and 1 is a tie, with an average score difference of +7.52 in GLM-5's favor.

GLM-5

Zhipu AI · 2026-02-11 · Chat model

GLM-4.7

Zhipu AI · 2025-12-22 · Chat model

GLM-5: 7 wins (78%) · Ties: 1 (11%) · GLM-4.7: 1 win

Benchmark scores

Grouped by capability, sorted by largest gap within each. 9 shared benchmarks.

Agent Level Benchmark

GLM-5 leads 2/2

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| Terminal Bench Hard | 43 (2 / 13) | 33.30 (7 / 13) | +9.70 |
| τ²-Bench | 89.70 (4 / 40) | 87.40 (6 / 40) | +2.30 |

General Knowledge

GLM-5 leads 2/2

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| HLE | 50.40 (18 / 157) | 42.80 (41 / 157) | +7.60 |
| GPQA Diamond | 86 (43 / 178), Thinking (No Tools) | 85.70 (44 / 178) | +0.30 |

Math and Reasoning

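GLM-4.7 leads 1/2 (1 tie)

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| AIME 2026 | 92.70 (7 / 14), Thinking (No Tools) | 92.90 (6 / 14) | -0.20 |
| FrontierMath - Tier 4 | 2.10 (56 / 80), Normal (No Tools) | 2.10 (56 / 80), Normal (No Tools) | 0.00 (tie) |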

AI Agent - Information Search

GLM-5 leads 1/1

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| BrowseComp | 75.90 (19 / 45) | 52 (34 / 45) | +23.90 |

AI Agent - Tool Usage

GLM-5 leads 1/1

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| Terminal Bench 2.0 | 61.10 (18 / 46) | 41 (43 / 46) | +20.10 |

Coding and Software Engineer

GLM-5 leads 1/1

| Benchmark | GLM-5 | GLM-4.7 | Diff |
| --- | --- | --- | --- |
| SWE-bench Verified | 77.80 (23 / 108), Thinking (No Tools) | 73.80 (39 / 108) | +4 |
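
The grouping and ordering used in the benchmark tables can be reproduced with a short sketch. The scores are copied from the rows above; the `ROWS` layout and the `group_and_sort` helper are hypothetical names chosen for illustration, and sorting by the absolute size of the gap is an assumption that happens to match the row order shown on this page.

```python
from collections import defaultdict

# (capability group, benchmark, GLM-5 score, GLM-4.7 score), copied from the tables above.
ROWS = [
    ("Agent Level Benchmark", "Terminal Bench Hard", 43.0, 33.3),
    ("Agent Level Benchmark", "τ²-Bench", 89.7, 87.4),
    ("General Knowledge", "HLE", 50.4, 42.8),
    ("General Knowledge", "GPQA Diamond", 86.0, 85.7),
    ("Math and Reasoning", "AIME 2026", 92.7, 92.9),
    ("Math and Reasoning", "FrontierMath - Tier 4", 2.1, 2.1),
    ("AI Agent - Information Search", "BrowseComp", 75.9, 52.0),
    ("AI Agent - Tool Usage", "Terminal Bench 2.0", 61.1, 41.0),
    ("Coding and Software Engineer", "SWE-bench Verified", 77.8, 73.8),
]


def group_and_sort(rows):
    """Group benchmarks by capability and order each group by gap size."""
    groups = defaultdict(list)
    for capability, benchmark, glm5, glm47 in rows:
        # Diff is GLM-5 minus GLM-4.7, rounded to avoid float noise.
        diff = round(glm5 - glm47, 2)
        groups[capability].append((benchmark, diff))
    # Assumption: "largest gap" means largest absolute difference.
    return {
        capability: sorted(items, key=lambda item: abs(item[1]), reverse=True)
        for capability, items in groups.items()
    }


grouped = group_and_sort(ROWS)
for capability, items in grouped.items():
    print(capability)
    for benchmark, diff in items:
        print(f"  {benchmark}: {diff:+.2f}")
```

With this ordering a tie (difference 0) always lands at the bottom of its group, which is where FrontierMath - Tier 4 appears.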

Specs

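| Field | GLM-5 | GLM-4.7 |
| --- | --- | --- |
| Publisher | Zhipu AI | Zhipu AI |
| Release date | 2026-02-11 | 2025-12-22 |
| Model type | Chat model | Chat model |
| Architecture | MoE | MoE |
| Parameters | 744B | 358B |
| Context length | 200K | 200K |
| Max output | 128K | 132072 |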

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

| Item | GLM-5 | GLM-4.7 |
| --- | --- | --- |
| Text input | $1 / 1M tokens | Not public |
| Text output | $3.2 / 1M tokens | Not public |
| Cache write | $0.2 / 1M tokens | Not public |

One or both models have incomplete public pricing.
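
As a rough illustration of how the per-million-token prices translate into a bill, the sketch below estimates the cost of a request volume. The token counts are made-up example values, `estimate_cost` is a hypothetical helper, and the cache-write price is left out because the page does not say how cached tokens are billed. In keeping with the note above, a missing price yields no estimate rather than a guess.

```python
from typing import Optional

# USD per 1M tokens, copied from the pricing table; None means "Not public".
PRICES_PER_MILLION = {
    "GLM-5": {"input": 1.0, "output": 3.2},
    "GLM-4.7": {"input": None, "output": None},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Optional[float]:
    """Return the estimated USD cost, or None if a price is not public."""
    prices = PRICES_PER_MILLION[model]
    if prices["input"] is None or prices["output"] is None:
        return None  # missing fields are not inferred
    cost = (
        input_tokens / 1_000_000 * prices["input"]
        + output_tokens / 1_000_000 * prices["output"]
    )
    return round(cost, 6)


# Hypothetical workload: 2M input tokens and 500K output tokens.
print(estimate_cost("GLM-5", 2_000_000, 500_000))
print(estimate_cost("GLM-4.7", 2_000_000, 500_000))
```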

Summary

  • GLM-5 leads in: Agent Level Benchmark (2/2), General Knowledge (2/2), AI Agent - Information Search (1/1), AI Agent - Tool Usage (1/1), Coding and Software Engineer (1/1)
  • GLM-4.7 leads in: Math and Reasoning (1/2)

On average across the 9 shared benchmarks, GLM-5 scores 7.52 higher.

Largest single-benchmark gap: BrowseComp, where GLM-5 scores 75.90 vs 52 for GLM-4.7 (+23.90).
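
The summary figures can be checked directly from the nine score pairs. The following is a minimal sketch, not the site's actual pipeline: `SCORES` and `summarize` are hypothetical names, and the difference is taken as GLM-5 minus GLM-4.7, which is what the signs in the tables suggest.

```python
# (benchmark, GLM-5 score, GLM-4.7 score), copied from the benchmark tables.
SCORES = [
    ("Terminal Bench Hard", 43.0, 33.3),
    ("τ²-Bench", 89.7, 87.4),
    ("HLE", 50.4, 42.8),
    ("GPQA Diamond", 86.0, 85.7),
    ("AIME 2026", 92.7, 92.9),
    ("FrontierMath - Tier 4", 2.1, 2.1),
    ("BrowseComp", 75.9, 52.0),
    ("Terminal Bench 2.0", 61.1, 41.0),
    ("SWE-bench Verified", 77.8, 73.8),
]


def summarize(scores):
    """Tally wins and ties, the mean difference, and the largest gap."""
    # Round each difference so float noise cannot turn a tie into a win.
    diffs = {name: round(glm5 - glm47, 2) for name, glm5, glm47 in scores}
    total = len(diffs)
    glm5_wins = sum(d > 0 for d in diffs.values())
    glm47_wins = sum(d < 0 for d in diffs.values())
    ties = sum(d == 0 for d in diffs.values())
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "glm5_wins": glm5_wins,
        "glm47_wins": glm47_wins,
        "ties": ties,
        "glm5_win_pct": round(100 * glm5_wins / total),
        "tie_pct": round(100 * ties / total),
        "mean_diff": round(sum(diffs.values()) / total, 2),
        "largest_gap": (largest, diffs[largest]),
    }


summary = summarize(SCORES)
print(summary)
```

Run on the scores above, this reproduces the 7 / 1 / 1 split, the 78% and 11% shares, the +7.52 average, and BrowseComp as the largest gap.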

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.