GLM-5.2 vs DeepSeek-V4-Pro

Across 4 shared benchmarks, GLM-5.2 leads overall: GLM-5.2 wins 4, DeepSeek-V4-Pro wins 0, with 0 ties and an average score difference of +32.75.

GLM-5.2
Zhipu AI (智谱AI) · 2026-06-13 · Reasoning model

DeepSeek-V4-Pro
DeepSeek-AI · 2026-04-24 · Reasoning model

GLM-5.2: 4 wins (100%) · DeepSeek-V4-Pro: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

General Knowledge

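GLM-5.2 leads 2/2

| Benchmark | GLM-5.2 | DeepSeek-V4-Pro | Diff |
|---|---|---|---|
| HLE | 54.70 · rank 8 / 159 · Thinking (With Tools) | 7.70 · rank 143 / 159 · Normal (No Tools) | +47 |
| GPQA Diamond | 91.20 · rank 15 / 179 · Thinking (No Tools) | 72.90 · rank 103 / 179 · Normal (No Tools) | +18.30 |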

Coding and Software Engineering

GLM-5.2 leads 1/1

| Benchmark | GLM-5.2 | DeepSeek-V4-Pro | Diff |
|---|---|---|---|
| SWE-Bench Pro - Public | 62.10 · rank 5 / 44 · Thinking (With Tools) | 52.10 · rank 29 / 44 · Normal (With Tools) | +10 |

Math and Reasoning

GLM-5.2 leads 1/1

| Benchmark | GLM-5.2 | DeepSeek-V4-Pro | Diff |
|---|---|---|---|
| IMO-AnswerBench | 91 · rank 1 / 20 · Thinking (No Tools) | 35.30 · rank 20 / 20 · Normal (No Tools) | +55.70 |

Specs

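| Field | GLM-5.2 | DeepSeek-V4-Pro |
|---|---|---|
| Publisher | Zhipu AI (智谱AI) | DeepSeek-AI |
| Release date | 2026-06-13 | 2026-04-24 |
| Model type | Reasoning model | Reasoning model |
| Architecture | MoE | MoE |
| Parameters | 753.33B | 1.6T |
| Context length | 1M | 1M |
| Max output | 128K | 375K |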

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

| Item | GLM-5.2 | DeepSeek-V4-Pro |
|---|---|---|
| Text input | $1.4 / 1M tokens | $0.435 / 1M tokens |
| Text output | $4.4 / 1M tokens | $0.87 / 1M tokens |
| Cache read | $0.26 / 1M tokens | $0.87 / 1M tokens |
| Cache write | Not public | $0.003625 / 1M tokens |
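
To make the per-token prices easier to compare, the following is a minimal sketch that turns the listed text input and output prices into a cost for a single request. The function name `request_cost` and the workload of 200,000 input tokens and 50,000 output tokens are assumptions chosen for illustration. Cache pricing is left out because the GLM-5.2 cache write price is not public.

```python
# USD per 1M tokens, copied from the API pricing table on this page.
PRICES = {
    "GLM-5.2": {"input": 1.4, "output": 4.4},
    "DeepSeek-V4-Pro": {"input": 0.435, "output": 0.87},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request, ignoring cache reads and writes."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload (an assumption, not a figure from the page).
INPUT_TOKENS = 200_000
OUTPUT_TOKENS = 50_000

costs = {m: request_cost(m, INPUT_TOKENS, OUTPUT_TOKENS) for m in PRICES}
for model, cost in costs.items():
    print(f"{model}: ${cost:.4f}")
```

Under this assumed workload, GLM-5.2 comes to about $0.50 per request and DeepSeek-V4-Pro to about $0.13. The ratio shifts with the input/output mix, since the output price gap is wider than the input price gap.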

Summary

  • GLM-5.2 leads in: General Knowledge (2/2), Coding and Software Engineering (1/1), Math and Reasoning (1/1)

On average across the 4 shared benchmarks, GLM-5.2 scores 32.75 higher.

Largest single-benchmark gap: IMO-AnswerBench — GLM-5.2 91 vs DeepSeek-V4-Pro 35.30 (+55.70).
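
The summary figures can be reproduced from the four score pairs. The sketch below is an illustration and not the page's own generation code; the helper name `summarize` is an assumption. It uses raw score subtraction and does not adjust for the differing run modes listed in the tables (Thinking vs Normal, with or without tools).

```python
# (benchmark, GLM-5.2 score, DeepSeek-V4-Pro score), copied from this page.
scores = [
    ("HLE", 54.70, 7.70),
    ("GPQA Diamond", 91.20, 72.90),
    ("SWE-Bench Pro - Public", 62.10, 52.10),
    ("IMO-AnswerBench", 91.00, 35.30),
]


def summarize(rows):
    """Return win counts, the mean score difference, and the largest gap."""
    # Round each difference to 2 decimals to hide floating-point noise.
    diffs = [(name, round(glm - deepseek, 2)) for name, glm, deepseek in rows]
    return {
        "glm_wins": sum(1 for _, d in diffs if d > 0),
        "deepseek_wins": sum(1 for _, d in diffs if d < 0),
        "ties": sum(1 for _, d in diffs if d == 0),
        "mean_diff": round(sum(d for _, d in diffs) / len(diffs), 2),
        "largest_gap": max(diffs, key=lambda item: abs(item[1])),
    }


summary = summarize(scores)
print(summary)
```

The mean is an unweighted average over benchmarks with different scales and difficulty, so it is best read as a rough indicator and not as a single capability score.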

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.