
Artificial Analysis Intelligence Index

Artificial Analysis Intelligence Index aggregates multiple rigorous benchmarks to compare AI model intelligence across coding, reasoning, science, tool use, and agentic tasks.

Top Model: Kimi K2.6
Top Score: 54
Model Count: 212
Data version: May 10, 2026

Data source: Artificial Analysis

Origin filter: All | China

Ranking Table

| Rank | Model | Intelligence Index | Organization |
|------|-------|--------------------|--------------|
| 6 | Kimi K2.6 | 54 | Moonshot AI |
| 14 | DeepSeek-V4-Pro (max) | 52 | DeepSeek-AI |
| 18 | DeepSeek-V4-Pro (high) | 50 | DeepSeek-AI |
| 20 | MiniMax-M2.7 | 50 | MiniMaxAI |
| 25 | DeepSeek-V4-Flash (max) | 47 | DeepSeek-AI |
| 30 | DeepSeek-V4-Flash (high) | 45 | DeepSeek-AI |
| 36 | Kimi K2.6 | 43 | Moonshot AI |
| 39 | Hy3-preview | 42 | Tencent |
| 46 | DeepSeek-V4-Pro | 39 | DeepSeek-AI |
| 51 | Step 3.5 Flash | 38 | StepFunAI |
| 55 | Kimi K2.5 | 37 | Moonshot AI |
| 58 | DeepSeek-V4-Flash | 36 | DeepSeek-AI |
| 67 | Hy3-preview | 34 | Tencent |
| 69 | Doubao Seed Code | 34 | ByteDance Seed |
| 97 | Qwen3.5 4B | 27 | Alibaba |
| 98 | DeepSeek-R1-0528 | 27 | DeepSeek-AI |
| 118 | Qwen3.5 4B | 23 | Alibaba |
| 142 | Qwen3.5 2B | 16 | Alibaba |
| 145 | DeepSeek-R1-Distill-Llama-70B | 16 | DeepSeek-AI |
| 149 | Step3 VL 10B | 15 | StepFun |
| 160 | Qwen3.5 2B | 15 | Alibaba |
| 164 | Kimi Linear 48B A3B Instruct | 14 | Kimi |
| 183 | Qwen3.5 0.8B | 11 | Alibaba |
| 189 | Qwen3.5 0.8B | 10 | Alibaba |

Data is for reference only; official sources are authoritative.
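As a small illustration of working with the ranking table, the sketch below copies its rows into Python tuples and picks the highest-scoring entry per organization. The function name `best_per_organization` and the tuple layout are hypothetical choices made for this example. Organization labels are kept exactly as listed, so "StepFun" and "StepFunAI", or "Kimi" and "Moonshot AI", are treated as separate keys.

```python
# Rows copied from the ranking table: (rank, model, intelligence_index, organization).
ROWS = [
    (6, "Kimi K2.6", 54, "Moonshot AI"),
    (14, "DeepSeek-V4-Pro (max)", 52, "DeepSeek-AI"),
    (18, "DeepSeek-V4-Pro (high)", 50, "DeepSeek-AI"),
    (20, "MiniMax-M2.7", 50, "MiniMaxAI"),
    (25, "DeepSeek-V4-Flash (max)", 47, "DeepSeek-AI"),
    (30, "DeepSeek-V4-Flash (high)", 45, "DeepSeek-AI"),
    (36, "Kimi K2.6", 43, "Moonshot AI"),
    (39, "Hy3-preview", 42, "Tencent"),
    (46, "DeepSeek-V4-Pro", 39, "DeepSeek-AI"),
    (51, "Step 3.5 Flash", 38, "StepFunAI"),
    (55, "Kimi K2.5", 37, "Moonshot AI"),
    (58, "DeepSeek-V4-Flash", 36, "DeepSeek-AI"),
    (67, "Hy3-preview", 34, "Tencent"),
    (69, "Doubao Seed Code", 34, "ByteDance Seed"),
    (97, "Qwen3.5 4B", 27, "Alibaba"),
    (98, "DeepSeek-R1-0528", 27, "DeepSeek-AI"),
    (118, "Qwen3.5 4B", 23, "Alibaba"),
    (142, "Qwen3.5 2B", 16, "Alibaba"),
    (145, "DeepSeek-R1-Distill-Llama-70B", 16, "DeepSeek-AI"),
    (149, "Step3 VL 10B", 15, "StepFun"),
    (160, "Qwen3.5 2B", 15, "Alibaba"),
    (164, "Kimi Linear 48B A3B Instruct", 14, "Kimi"),
    (183, "Qwen3.5 0.8B", 11, "Alibaba"),
    (189, "Qwen3.5 0.8B", 10, "Alibaba"),
]


def best_per_organization(rows):
    """Return {organization: (model, score)} for each organization's top-scoring row.

    On ties within an organization, the row that appears first is kept.
    """
    best = {}
    for _rank, model, score, org in rows:
        if org not in best or score > best[org][1]:
            best[org] = (model, score)
    return best


if __name__ == "__main__":
    # Print organizations from highest to lowest best score.
    ranked = sorted(best_per_organization(ROWS).items(), key=lambda kv: -kv[1][1])
    for org, (model, score) in ranked:
        print(f"{org:15} {model:30} {score}")
```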

Benchmark Components (Intelligence Index v4.0)

The Intelligence Index aggregates 10 rigorous benchmarks into a holistic measure of AI capabilities, so that narrow specialization in a single area does not dominate the score.

  • GDPval-AA: Agentic real-world tasks
  • τ²-Bench: Agentic tool use
  • Terminal-Bench Hard: Agentic coding
  • SciCode: Coding proficiency
  • AA-LCR: Long context reasoning
  • AA-Omniscience: Knowledge & hallucination
  • IFBench: Instruction following
  • Humanity's Last Exam: Reasoning & knowledge
  • GPQA Diamond: Scientific reasoning
  • CritPt: Physics reasoning

FAQ

What is the Artificial Analysis Intelligence Index?
The Artificial Analysis Intelligence Index v4.0 is a composite benchmark that aggregates performance across 10 challenging evaluations, spanning coding, reasoning, science, tool use, and agentic tasks, to measure AI capabilities holistically. It is designed so that narrow specialization is not rewarded, and it provides a single score for tracking progress.
How is the Intelligence Index calculated?
The index aggregates scores from 10 benchmarks: GDPval-AA (agentic real-world tasks), τ²-Bench (tool use), Terminal-Bench Hard (agentic coding), SciCode (coding), AA-LCR (long context reasoning), AA-Omniscience (knowledge & hallucination), IFBench (instruction following), Humanity's Last Exam (reasoning), GPQA Diamond (scientific reasoning), and CritPt (physics). All tests are independently run by Artificial Analysis on standardized hardware.
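This page does not state how the 10 benchmark scores are weighted, so the following is only a minimal sketch of one possible aggregation. It assumes every benchmark is reported on a 0 to 100 scale and takes a plain unweighted mean. The function name `intelligence_index` and the example scores are hypothetical, and the methodology published by Artificial Analysis is authoritative for the real formula.

```python
from statistics import mean

# The 10 component benchmarks named in the answer above.
BENCHMARKS = [
    "GDPval-AA",
    "τ²-Bench",
    "Terminal-Bench Hard",
    "SciCode",
    "AA-LCR",
    "AA-Omniscience",
    "IFBench",
    "Humanity's Last Exam",
    "GPQA Diamond",
    "CritPt",
]


def intelligence_index(scores: dict[str, float]) -> float:
    """Unweighted mean of per-benchmark scores (ASSUMPTION: equal weights, 0-100 scale)."""
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return mean(scores[name] for name in BENCHMARKS)


# Hypothetical scores for illustration only; these are not real benchmark results.
example = {name: 50.0 for name in BENCHMARKS}
example["SciCode"] = 40.0
example["GPQA Diamond"] = 80.0

if __name__ == "__main__":
    print(intelligence_index(example))
```

Requiring all 10 scores before averaging is a deliberate choice in this sketch: silently skipping a missing benchmark would make composite scores incomparable between models.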
How does this differ from LMArena?
LMArena rankings are based on crowdsourced user votes (Elo ratings from blind A/B tests), reflecting subjective human preferences. The Artificial Analysis Intelligence Index uses standardized automated benchmarks with objective scoring, measuring technical capabilities across specific domains. Both perspectives are valuable: LMArena captures real-world user experience, while the AA Intelligence Index provides reproducible technical measurements.
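To make the contrast concrete, here is a minimal sketch of a textbook Elo update after one blind A/B vote. It illustrates the general rating idea only. LMArena's actual rating computation may differ, and the K-factor of 32 and the function names are assumptions chosen for this example.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return the new (rating_a, rating_b) after one comparison.

    outcome_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    k is an illustrative step size (ASSUMPTION), not a value taken from LMArena.
    """
    delta = k * (outcome_a - expected_score(rating_a, rating_b))
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


if __name__ == "__main__":
    # Two equally rated models; a user prefers model A in a blind comparison.
    print(update(1000.0, 1000.0, 1.0))
```

Because each vote shifts ratings relative to the opponent's current rating, scores from this kind of system depend on who voted and which pairs were shown, whereas a benchmark average depends only on the fixed test sets.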
Where can I find the original data?
The original leaderboard and detailed methodology are available at artificialanalysis.ai, where the Intelligence Index methodology is documented on the Intelligence Index page.