GLM-5.2 vs DeepSeek-V4-Pro

Across 4 shared benchmarks, GLM-5.2 leads overall: GLM-5.2 wins 4, DeepSeek-V4-Pro wins 0, with 0 ties and an average score difference of +32.75.

GLM-5.2: Zhipu AI · 2026-06-13 · Reasoning model

DeepSeek-V4-Pro: DeepSeek-AI · 2026-04-24 · Reasoning model

GLM-5.2: 4 wins (100%) · DeepSeek-V4-Pro: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge: GLM-5.2 leads 2/2

Benchmark	GLM-5.2	DeepSeek-V4-Pro	Diff
HLE	54.70 (rank 8 / 159, Thinking, With Tools)	7.70 (rank 143 / 159, Normal, No Tools)	+47.00
GPQA Diamond	91.20 (rank 15 / 179, Thinking, No Tools)	72.90 (rank 103 / 179, Normal, No Tools)	+18.30

Coding and Software Engineer: GLM-5.2 leads 1/1

Benchmark	GLM-5.2	DeepSeek-V4-Pro	Diff
SWE-Bench Pro - Public	62.10 (rank 5 / 44, Thinking, With Tools)	52.10 (rank 29 / 44, Normal, With Tools)	+10.00

Math and Reasoning: GLM-5.2 leads 1/1

Benchmark	GLM-5.2	DeepSeek-V4-Pro	Diff
IMO-AnswerBench	91.00 (rank 1 / 20, Thinking, No Tools)	35.30 (rank 20 / 20, Normal, No Tools)	+55.70

Prices use DataLearner records when available; missing fields are not inferred.

GLM-5.2 leads in: General Knowledge (2/2), Coding and Software Engineer (1/1), Math and Reasoning (1/1)

On average across the 4 shared benchmarks, GLM-5.2 scores 32.75 higher.

Largest single-benchmark gap: IMO-AnswerBench, where GLM-5.2 scores 91 vs 35.30 for DeepSeek-V4-Pro (+55.70).
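
The summary figures above can be reproduced from the four score pairs. The sketch below is illustrative only: the page does not say how it aggregates, so it assumes an unweighted mean of the per-benchmark differences (GLM-5.2 minus DeepSeek-V4-Pro). The function name `summarize` and the dictionary layout are hypothetical and not part of any DataLearner tooling.

```python
# Scores transcribed from the tables on this page, as (GLM-5.2, DeepSeek-V4-Pro).
SCORES = {
    "HLE": (54.70, 7.70),
    "GPQA Diamond": (91.20, 72.90),
    "SWE-Bench Pro - Public": (62.10, 52.10),
    "IMO-AnswerBench": (91.00, 35.30),
}


def summarize(scores):
    """Return win counts, mean difference, and largest gap (assumed aggregation)."""
    # Per-benchmark difference: first model minus second, rounded to 2 decimals
    # to hide floating-point noise.
    diffs = {name: round(a - b, 2) for name, (a, b) in scores.items()}

    wins_first = sum(d > 0 for d in diffs.values())
    wins_second = sum(d < 0 for d in diffs.values())
    ties = sum(d == 0 for d in diffs.values())

    # Assumption: the page's "average score difference" is an unweighted mean.
    mean_diff = round(sum(diffs.values()) / len(diffs), 2)

    # Benchmark with the largest absolute gap.
    largest_name = max(diffs, key=lambda name: abs(diffs[name]))

    return {
        "diffs": diffs,
        "wins_first": wins_first,
        "wins_second": wins_second,
        "ties": ties,
        "mean_diff": mean_diff,
        "largest": (largest_name, diffs[largest_name]),
    }


summary = summarize(SCORES)
print(summary)
```

Under that assumption the result matches the page: 4 wins to 0, no ties, a mean difference of 32.75, and IMO-AnswerBench as the largest gap. The calculation takes the scores at face value and does not adjust for the differing evaluation modes listed in the tables.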

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.