See key specs and per-benchmark scores for each model/mode. This page compares evaluation data and core specifications for 2 models.

GLM 5.1
Zhipu AI
Best overall: GLM 5.1 · 72.60
Best single: GLM 5.1 · GPQA Diamond 86.20
Modality coverage: GLM 5.1 · 1 modality
Head to head (GLM 5.1 vs. GLM-4.6)
Benchmarks: 3
Wins: 3
Losses: 0
Average diff: +19.80
Compare benchmark results across thinking modes and tool usage.
Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
Complete scores for each model/mode across selected benchmarks.
3 benchmarks with comparable scores. Each model shows its best score; the mode label follows the score in parentheses.
| Benchmark | GLM 5.1 | GLM-4.6 |
|---|---|---|
| GPQA Diamond (general evaluation) | 86.20 (Thinking Enabled) | 82.90 (Thinking Enabled, Tools) |
| HLE (general evaluation) | 52.30 (Thinking Enabled, Tools) | 30.40 (Thinking Enabled, Tools) |
| BrowseComp (AI Agent, information gathering) | 79.30 (Thinking Enabled, Tools) | 45.10 (Thinking Enabled, Tools) |
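The summary figures on this page (best overall 72.60, 3 wins, 0 losses, average diff +19.80) are consistent with simple arithmetic over the three rows in the table above. The sketch below reproduces them under that assumption: an unweighted mean of GLM 5.1's scores, and the mean of the per-benchmark differences against GLM-4.6. The page does not document its formula, so the function name `summarize` and the calculation itself are illustrative rather than the site's actual method.

```python
# Scores copied from the comparison table: (GLM 5.1, GLM-4.6).
scores = {
    "GPQA Diamond": (86.20, 82.90),
    "HLE": (52.30, 30.40),
    "BrowseComp": (79.30, 45.10),
}


def summarize(scores):
    """Head-to-head stats for the first model against the second.

    Assumes an unweighted mean; this is a guess at the page's method.
    """
    diffs = [a - b for a, b in scores.values()]
    n = len(diffs)
    return {
        "benchmarks": n,
        "wins": sum(d > 0 for d in diffs),
        "losses": sum(d < 0 for d in diffs),
        "average_diff": round(sum(diffs) / n, 2),
        "mean_score": round(sum(a for a, _ in scores.values()) / n, 2),
    }


summary = summarize(scores)
print(summary)
```

Note that the scores were not all obtained in the same mode (GLM 5.1's GPQA Diamond score is listed without tools, GLM-4.6's with tools), so the average diff mixes evaluation settings.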
Side-by-side input/output token pricing
Licensing, MoE architecture, and multi-modality support.
| Features & specs | GLM 5.1 (Zhipu AI) | GLM-4.6 (Zhipu AI) |
|---|---|---|
| Core specs | | |
| Release | 2026-03-27 | 2025-09-30 |
| Context length | 200K | 200K |
| Parameters | 754 | 3550 |
| Active parameters | 40 | 320 |
| Max output | 128000 | 131072 |
| MoE | Yes | Yes |
| Supported modes | No mode data | Non-Thinking Mode, Thinking Mode |
| License | | |
| Code Open Source | Closed Source | Closed Source |
| Weights Open Source | Closed Source | Closed Source |
| Commercial use | Free commercial-use license | Free commercial-use license |
| Modality support | | |
| Text Input/Output | / | / |
| Resources | | |
| Paper / report | GLM-5.1: Towards Long-Horizon Tasks | GLM-4.6: Advanced Agentic, Reasoning and Coding Capabilities |

GLM-4.6
Zhipu AI