DataLearnerAI
A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.

LLM Evaluation Comparison Results

See key specs and per-benchmark scores for each model/mode. This comparison currently covers evaluation data and core specs for 2 models: GLM-5 and Kimi K2.5.

Spec comparison

| Spec | GLM-5 | Kimi K2.5 |
|---|---|---|
| Organization | Zhipu AI (智谱AI) | Moonshot AI |
| Release | 2026-02-11 | 2026-01-27 |
| Context length | 200K | 256K |
| Parameters | 7440 | 10000 |
| Supported modes | Non-Thinking Mode, Thinking Mode | Non-Thinking Mode, Thinking Mode |

Performance benchmarks

Compare benchmark results across thinking modes and tool usage.

Data sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.
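The sourcing order described above reads like a priority fallback. The following is a minimal sketch of that idea, not the site's actual pipeline: the tier names, the `pick_score` helper, and the sample values are all hypothetical and used only for illustration.

```python
# Hypothetical tier names, ordered from most to least preferred,
# mirroring the sourcing order stated on the page.
SOURCE_PRIORITY = ("official", "leaderboard", "third_party")


def pick_score(candidates):
    """Return (tier, score) from the highest-priority tier that has a score.

    `candidates` maps a tier name to a score, or to None when that
    source reports nothing. Returns None if no tier has a score.
    """
    for tier in SOURCE_PRIORITY:
        value = candidates.get(tier)
        if value is not None:
            return tier, value
    return None


# Illustrative values only; these are not scores from the page.
example = {"official": None, "leaderboard": 71.0, "third_party": 69.5}
chosen = pick_score(example)  # falls back to the leaderboard entry
```

With this shape, a score from an official release always wins when present, and a lower tier is consulted only when every higher tier is empty.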

  • Best Overall: GLM-5 · 67.56
  • Best Single: GLM-5 · AIME 2026 92.70
  • Thinking Mode (Default): GLM-5 · 1 All Modes

Benchmark scores

Higher is usually better; “—” means no score.


Benchmark score table

Complete scores for each model/mode across selected benchmarks.

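8 benchmarks · 2 models
Supported modes: Normal, Think, Deep, Tool, Parallel

| Category | Benchmark | GLM-5 (Zhipu AI) | Kimi K2.5 (Moonshot AI) |
|---|---|---|---|
| General evaluation | GPQA Diamond | 86.00 | 87.60 |
| General evaluation | HLE | 30.50 | 30.10 |
| Coding and software engineering | SWE-bench Verified | 77.80 | 76.80 |
| AI agent: information gathering | BrowseComp | 62.00 | 60.60 |
| Math reasoning | AIME 2026 | 92.70 | 92.50 |
| Math reasoning | IMO-AnswerBench | 82.50 | 81.80 |
| Productivity knowledge | GDPval-AA | 46.00 | 40.00 |
| Long-context ability | AA-LCR | 63.00 | 65.00 |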
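The "Best Overall" figure of 67.56 for GLM-5 is consistent with an unweighted mean of its eight benchmark scores. The page does not state how the overall number is computed, so the aggregation below is an assumption; the sketch recomputes the mean for both models and counts per-benchmark wins.

```python
from statistics import mean

MODELS = ("GLM-5", "Kimi K2.5")

# Scores copied from the benchmark table: (GLM-5, Kimi K2.5).
SCORES = {
    "GPQA Diamond": (86.00, 87.60),
    "HLE": (30.50, 30.10),
    "SWE-bench Verified": (77.80, 76.80),
    "BrowseComp": (62.00, 60.60),
    "AIME 2026": (92.70, 92.50),
    "IMO-AnswerBench": (82.50, 81.80),
    "GDPval-AA": (46.00, 40.00),
    "AA-LCR": (63.00, 65.00),
}


def overall_means(scores):
    """Unweighted mean per model (assumed aggregation, not confirmed by the page)."""
    return {
        model: mean(row[i] for row in scores.values())
        for i, model in enumerate(MODELS)
    }


def win_counts(scores):
    """Count the benchmarks on which each model has the strictly higher score."""
    counts = {model: 0 for model in MODELS}
    for glm, kimi in scores.values():
        if glm > kimi:
            counts["GLM-5"] += 1
        elif kimi > glm:
            counts["Kimi K2.5"] += 1
    return counts


means = overall_means(SCORES)
print({m: round(v, 2) for m, v in means.items()})  # {'GLM-5': 67.56, 'Kimi K2.5': 66.8}
print(win_counts(SCORES))                          # {'GLM-5': 6, 'Kimi K2.5': 2}
```

Under this assumption GLM-5 leads on average by under one point, while Kimi K2.5 is ahead on GPQA Diamond and AA-LCR.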

Feature compare

Detailed feature breakdown

Licensing, MoE architecture, and multi-modality support.

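Features & specs

Model snapshots

| Field | GLM-5 | Kimi K2.5 |
|---|---|---|
| Organization | Zhipu AI (智谱AI) | Moonshot AI |
| Full model name | GLM-5 | Kimi K2.5 |
| Model description | Not provided | Not provided |
| Model type | Chat LLM | Multimodal LLM |
| Model ID | glm-5 | kimi-k2-5 |
| Release | 2026-02-11 | 2026-01-27 |
| MoE | Yes | Yes |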

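Specs and performance

| Field | GLM-5 | Kimi K2.5 |
|---|---|---|
| Context length | 200K | 256K |
| Parameters | 7440 | 10000 |
| Activated parameters | 400 | 320 |
| Model scale | 100b | 100b |
| Model size | 1.51TB | 595GB |
| Inference speed | (no value shown) | (no value shown) |
| Inference tier | (no value shown) | (no value shown) |
| Max output | 131072 | 16384 |
| Supported modes | Non-Thinking Mode, Thinking Mode | Non-Thinking Mode, Thinking Mode |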
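Since both models are MoE, one quick way to read the total and activated parameter figures listed above is as a ratio. The page does not state the unit for these counts, so the sketch below assumes only that both figures for a given model share the same unit; the `active_share` helper is illustrative.

```python
# (total parameters, activated parameters) as shown on the page.
# The unit is not stated, so only the ratio is meaningful here.
PARAMS = {
    "GLM-5": (7440, 400),
    "Kimi K2.5": (10000, 320),
}


def active_share(total, active):
    """Fraction of the total parameters that are activated."""
    return active / total


for name, (total, active) in PARAMS.items():
    print(f"{name}: {active_share(total, active):.1%} of parameters active")
# GLM-5: 5.4% of parameters active
# Kimi K2.5: 3.2% of parameters active
```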

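Open source and licensing

| Field | GLM-5 | Kimi K2.5 |
|---|---|---|
| Code Open Source | Not provided | Not provided |
| Weights Open Source | Closed Source | Not provided |
| Commercial use | Free commercial-use license | Free commercial-use license |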

Modality support

| Modality (Input/Output) | GLM-5 | Kimi K2.5 |
|---|---|---|
| Text | / | / |
| Image | Not provided | / |
| Audio | Not provided | / |
| Video | Not provided | / |
| Embedding | Not provided | / |

API details

| Pricing | GLM-5 | Kimi K2.5 |
|---|---|---|
| Text | Input: $1 / 1M tokens; Output: $3.2 / 1M tokens; Cache: $0.2 / 1M tokens | Input: $0.6 / 1M tokens; Output: $3 / 1M tokens; Cache: $0.1 / 1M tokens |
| Image | Not provided | Input: $0.6 / 1M tokens; Cache: $0.1 / 1M tokens |
| Audio | Not provided | Not provided |
| Video | Not provided | Not provided |
| Embedding | Not provided | Not provided |

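Resources

| Resource | GLM-5 | Kimi K2.5 |
|---|---|---|
| GitHub | Repo | Repo |
| Hugging Face | Model Page | Model Page |
| Official Page | Not provided | Not provided |
| Guides | Not provided | Not provided |
| Papers | GLM-5: From Vibe Coding to Agentic Engineering | Kimi K2.5: Visual Agentic Intelligence |
| DataLearnerAI | Not provided | Big news! Kimi K2.5 released, still free and open source! Native multimodal MoE architecture, one of the largest open-source models by parameter count, with official evaluation results rivaling many closed-source models! Can drive 100 sub-agents! |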

API pricing

API price comparison

Side-by-side input/output token pricing
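To make the per-token text prices listed under the API details concrete, the sketch below estimates the cost of a single workload for each model. It assumes the "Cache" price applies to cached input tokens in place of the normal input rate; the page does not define it, and the `TextPricing` class, `request_cost` helper, and workload sizes are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextPricing:
    """USD per 1M tokens, as listed on the page."""
    input_per_m: float
    output_per_m: float
    cache_per_m: float


PRICING = {
    "GLM-5": TextPricing(input_per_m=1.0, output_per_m=3.2, cache_per_m=0.2),
    "Kimi K2.5": TextPricing(input_per_m=0.6, output_per_m=3.0, cache_per_m=0.1),
}


def request_cost(pricing, input_tokens, output_tokens, cached_tokens=0):
    """Estimated cost in USD.

    Assumption: cached tokens are a subset of the input tokens and are
    billed at the cache rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * pricing.input_per_m
        + cached_tokens * pricing.cache_per_m
        + output_tokens * pricing.output_per_m
    )
    return total / 1_000_000


# Hypothetical workload: 2M input tokens, 0.5M output tokens, no cache hits.
for name, pricing in PRICING.items():
    print(f"{name}: ${request_cost(pricing, 2_000_000, 500_000):.2f}")
# GLM-5: $3.60
# Kimi K2.5: $2.70
```

For this input-heavy workload Kimi K2.5 comes out cheaper, mostly because of its lower input rate; the output rates differ by only $0.2 per 1M tokens.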
