GLM-5.2 vs GLM-4.7

Across 4 shared benchmarks, GLM-5.2 leads overall: GLM-5.2 wins 4, GLM-4.7 wins 0, with 0 ties and an average score difference of +11.30.

GLM-5.2: Zhipu AI · 2026-06-13 · Reasoning model

GLM-4.7: Zhipu AI · 2025-12-22 · Chat model

GLM-5.2: 4 wins (100%) · GLM-4.7: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge: GLM-5.2 leads 2/2

Benchmark	GLM-5.2	GLM-4.7	Diff
HLE	54.70 (rank 8 / 159, Thinking (With Tools))	42.80 (rank 42 / 159)	+11.90
GPQA Diamond	91.20 (rank 15 / 179, Thinking (No Tools))	85.70 (rank 45 / 179)	+5.50

Coding and Software Engineer: GLM-5.2 leads 1/1

Benchmark	GLM-5.2	GLM-4.7	Diff
SWE-Bench Pro - Public	62.10 (rank 5 / 44, Thinking (With Tools))	40.60 (rank 40 / 44)	+21.50

Math and Reasoning: GLM-5.2 leads 1/1

Benchmark	GLM-5.2	GLM-4.7	Diff
AIME 2026	99.20 (rank 1 / 15, Thinking (No Tools))	92.90 (rank 7 / 15)	+6.30
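As a rough illustration of the layout described above (grouped by capability, sorted by largest gap within each group), the following Python sketch reproduces that ordering from the four score pairs in the tables. The names `ROWS` and `group_by_capability` are hypothetical and are not taken from the page's actual generator; the capability labels are the ones used in the page's own summary.

```python
from collections import defaultdict

# (capability, benchmark, GLM-5.2 score, GLM-4.7 score), copied from the tables.
# Deliberately listed out of gap order to show that sorting does the work.
ROWS = [
    ("General Knowledge", "GPQA Diamond", 91.20, 85.70),
    ("General Knowledge", "HLE", 54.70, 42.80),
    ("Coding and Software Engineer", "SWE-Bench Pro - Public", 62.10, 40.60),
    ("Math and Reasoning", "AIME 2026", 99.20, 92.90),
]


def group_by_capability(rows):
    """Group rows by capability, then sort each group by descending score gap."""
    groups = defaultdict(list)
    for capability, benchmark, score_a, score_b in rows:
        # Round to two decimals to match the precision shown on the page.
        groups[capability].append((benchmark, round(score_a - score_b, 2)))
    for entries in groups.values():
        entries.sort(key=lambda entry: entry[1], reverse=True)
    return dict(groups)


GROUPED = group_by_capability(ROWS)

for capability, entries in GROUPED.items():
    print(capability)
    for benchmark, diff in entries:
        print(f"  {benchmark}: {diff:+.2f}")
```

Within General Knowledge, HLE is listed before GPQA Diamond because its gap is larger, which matches the row order in the table.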

Prices use DataLearner records when available; missing fields are not inferred.

One or both models have incomplete public pricing.

GLM-5.2 leads in: General Knowledge (2/2), Coding and Software Engineer (1/1), Math and Reasoning (1/1)

On average across the 4 shared benchmarks, GLM-5.2 scores 11.30 points higher.

Largest single-benchmark gap: SWE-Bench Pro - Public — GLM-5.2 62.10 vs GLM-4.7 40.60 (+21.50).
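The summary figures (win counts, average difference, largest gap) follow directly from the four score pairs. A minimal Python sketch of that arithmetic is shown here, assuming a simple unweighted mean of the per-benchmark differences; `SCORES` and `summarize` are illustrative names, not the page's actual implementation.

```python
# (benchmark, GLM-5.2 score, GLM-4.7 score), copied from the benchmark tables.
SCORES = [
    ("HLE", 54.70, 42.80),
    ("GPQA Diamond", 91.20, 85.70),
    ("SWE-Bench Pro - Public", 62.10, 40.60),
    ("AIME 2026", 99.20, 92.90),
]


def summarize(scores):
    """Compute win counts, the mean score difference, and the largest gap."""
    # Positive difference means the first model (GLM-5.2) scored higher.
    diffs = {name: round(a - b, 2) for name, a, b in scores}
    largest = max(diffs, key=lambda name: abs(diffs[name]))
    return {
        "wins_first": sum(d > 0 for d in diffs.values()),
        "wins_second": sum(d < 0 for d in diffs.values()),
        "ties": sum(d == 0 for d in diffs.values()),
        # Unweighted mean over shared benchmarks (an assumption of this sketch).
        "mean_diff": round(sum(diffs.values()) / len(diffs), 2),
        "largest_gap": (largest, diffs[largest]),
    }


SUMMARY = summarize(SCORES)
print(SUMMARY)
```

Under these assumptions the differences are 11.90, 5.50, 21.50 and 6.30, which sum to 45.20 and average to 11.30 over 4 benchmarks, consistent with the figures stated on the page.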

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.