GLM 5.1 vs GLM-4.6 Benchmark Comparison

Key specs and per-benchmark scores for each model and mode. This page compares benchmark data and core specifications for 2 models.

GLM 5.1

Zhipu AI

Release: 2026-03-27
Context length: 200K
Parameters: 754B (40B active)
Max output: 128,000 tokens


Best overall: GLM 5.1 · 72.60
Best single benchmark: GLM 5.1 · GPQA Diamond 86.20
Modality coverage: GLM 5.1 · 1 modality

Head to head: GLM 5.1 vs GLM-4.6

Average diff: +19.80

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.



Benchmark score table

Complete scores for each model/mode across selected benchmarks.

3 benchmarks with comparable scores. Each model shows its best score, followed by the mode in which that score was achieved.

Benchmark	Category	GLM 5.1	GLM-4.6
GPQA Diamond	General evaluation	86.20 (Thinking Enabled)	82.90 (Thinking Enabled | Tools)
HLE	General evaluation	52.30 (Thinking Enabled | Tools)	30.40 (Thinking Enabled | Tools)
BrowseComp	AI Agent - information gathering	79.30 (Thinking Enabled | Tools)	45.10 (Thinking Enabled | Tools)
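The page does not say how its summary figures are derived. As a minimal sketch, assuming they are plain unweighted means over the three benchmarks in the table above, the following reproduces the 72.60 overall score for GLM 5.1 and the +19.80 average diff. The `summarize` function and its field names are hypothetical, introduced only for illustration.

```python
from statistics import mean

# Scores copied from the benchmark table: (GLM 5.1, GLM-4.6).
scores = {
    "GPQA Diamond": (86.20, 82.90),
    "HLE": (52.30, 30.40),
    "BrowseComp": (79.30, 45.10),
}

def summarize(table):
    """Head-to-head statistics for the first model versus the second.

    Assumption: every benchmark is weighted equally (simple mean).
    """
    diffs = [a - b for a, b in table.values()]
    return {
        "benchmarks": len(table),
        "wins": sum(d > 0 for d in diffs),
        "ties": sum(d == 0 for d in diffs),
        "losses": sum(d < 0 for d in diffs),
        "average_diff": round(mean(diffs), 2),
        "mean_first": round(mean(a for a, _ in table.values()), 2),
        "mean_second": round(mean(b for _, b in table.values()), 2),
    }

summary = summarize(scores)
print(summary)
```

Under this assumption GLM 5.1 leads on all three benchmarks. Note that the best scores are not always taken in the same mode (the GPQA Diamond scores differ in tool usage), so the per-benchmark differences are not strictly like-for-like.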

API price comparison

Side-by-side input/output token pricing

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

Features & specs	GLM 5.1 (Zhipu AI)	GLM-4.6 (Zhipu AI)
Core specs
Release	2026-03-27	2025-09-30
Context length	200K	200K
Parameters (B)	754	355
Active parameters (B)	40	32
Max output (tokens)	128000	131072
MoE	Yes	Yes
Supported modes	No mode data	Non-Thinking Mode, Thinking Mode
License
Code open source	Closed Source	Closed Source
Weights open source	Closed Source	Closed Source
Commercial use	Free commercial-use license	Free commercial-use license
Modality support
Text input/output	/	/
Resources
Paper / report	GLM-5.1: Towards Long-Horizon Tasks	GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities