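Artificial Analysis Intelligence Index: AI Model Intelligence Leaderboard

Artificial Analysis Intelligence Index v4.0 combines 10 authoritative benchmarks (including GDPval-AA, Terminal-Bench, GPQA Diamond, and SciCode) to evaluate and rank AI models across mathematics, science, coding, reasoning, and other dimensions.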

Top Model: Gemini 3.1 Pro Preview
Top Score: 57
Model Count: 196
Data version: March 26, 2026

Data source: Artificial Analysis

Artificial Analysis Intelligence Index Ranking

Top 10

Chart Source: DataLearnerAI · Data Source: Artificial Analysis

Ranking Table

| Rank | Model | Intelligence Index | Organization |
|------|-------|--------------------|--------------|
| 1 | Gemini 3.1 Pro Preview | 57 | Google |
| 2 | GPT-5.4 (xhigh) | 57 | OpenAI |
| 3 | GPT-5.3 Codex (xhigh) | 54 | OpenAI |
| 4 | Claude Opus 4.6 (max) | 53 | Anthropic |
| 5 | Claude Sonnet 4.6 (max) | 52 | Anthropic |
| 6 | GLM-5 | 50 | Z AI |
| 7 | MiniMax-M2.7 | 50 | MiniMax |
| 8 | MiMo-V2-Pro | 49 | Xiaomi |
| 9 | Grok 4.20 Beta 0309 | 48 | xAI |
| 10 | GPT-5.4 mini (xhigh) | 48 | OpenAI |
| 11 | Kimi K2.5 | 47 | Kimi |
| 12 | GLM-5-Turbo | 47 | Z AI |
| 13 | Claude Opus 4.6 | 46 | Anthropic |
| 14 | Gemini 3 Flash | 46 | Google |
| 15 | Qwen3.5 397B A17B | 45 | Alibaba |
| 16 | GPT-5.4 nano (xhigh) | 44 | OpenAI |
| 17 | Claude Sonnet 4.6 | 44 | Anthropic |
| 18 | MiMo-V2-Omni | 43 | Xiaomi |
| 19 | Claude Sonnet 4.6 (Non-reasoning) | 43 | Anthropic |
| 20 | Qwen3.5 27B | 42 | Alibaba |
| 21 | DeepSeek V3.2 | 42 | DeepSeek |
| 22 | Qwen3.5 122B A10B | 42 | Alibaba |
| 23 | MiMo-V2-Flash (Feb 2026) | 41 | Xiaomi |
| 24 | Gemini 3 Pro Preview (low) | 41 | Google |
| 25 | GLM-5 | 41 | Z AI |
| 26 | Qwen3.5 397B A17B | 40 | Alibaba |
| 27 | Qwen3 Max Thinking | 40 | Alibaba |
| 28 | o3 | 38 | OpenAI |
| 29 | GPT-5.4 nano | 38 | OpenAI |
| 30 | Step 3.5 Flash | 38 | StepFun |
| 31 | GPT-5.4 mini (medium) | 38 | OpenAI |
| 32 | Kimi K2.5 | 37 | Kimi |
| 33 | Qwen3.5 27B | 37 | Alibaba |
| 34 | Qwen3.5 35B A3B | 37 | Alibaba |
| 35 | Claude 4.5 Haiku | 37 | Anthropic |
| 36 | KAT-Coder-Pro V1 | 36 | KwaiKAT |
| 37 | NVIDIA Nemotron 3 Super | 36 | NVIDIA |
| 38 | Qwen3.5 122B A10B | 36 | Alibaba |
| 39 | Nova 2.0 Pro Preview (medium) | 36 | Amazon |
| 40 | GPT-5.4 | 35 | OpenAI |
| 41 | Gemini 3 Flash | 35 | Google |
| 42 | Gemini 2.5 Pro | 35 | Google |
| 43 | Gemini 3.1 Flash-Lite Preview | 34 | Google |
| 44 | Doubao Seed Code | 34 | ByteDance Seed |
| 45 | gpt-oss-120B (high) | 33 | OpenAI |
| 46 | Mercury 2 | 33 | Inception |
| 47 | Qwen3.5 9B | 32 | Alibaba |
| 48 | K-EXAONE | 32 | LG AI Research |
| 49 | DeepSeek V3.2 | 32 | DeepSeek |
| 50 | Grok 3 mini Reasoning (high) | 32 | xAI |
| 51 | Nova 2.0 Pro Preview (low) | 32 | Amazon |
| 52 | Claude 4.5 Haiku | 31 | Anthropic |
| 53 | Qwen3.5 35B A3B | 31 | Alibaba |
| 54 | MiMo-V2-Flash | 30 | Xiaomi |
| 55 | Nova 2.0 Lite (medium) | 30 | Amazon |
| 56 | Grok 4.20 Beta 0309 | 30 | xAI |
| 57 | DeepSeek V3.2 Speciale | 29 | DeepSeek |
| 58 | ERNIE 5.0 Thinking Preview | 29 | Baidu |
| 59 | Grok Code Fast 1 | 29 | xAI |
| 60 | Qwen3 Coder Next | 28 | Alibaba |
| 61 | Nova 2.0 Omni (medium) | 28 | Amazon |
| 62 | Apriel-v1.6-15B-Thinker | 28 | ServiceNow |
| 63 | Qwen3.5 9B | 27 | Alibaba |
| 64 | Magistral Medium 1.2 | 27 | Mistral |
| 65 | Qwen3.5 4B | 27 | Alibaba |
| 66 | DeepSeek R1 0528 | 27 | DeepSeek |
| 67 | Mistral Small 4 | 27 | Mistral |
| 68 | Qwen3 Next 80B A3B | 27 | Alibaba |
| 69 | Qwen3 Coder 480B | 25 | Alibaba |
| 70 | Nova 2.0 Lite (low) | 25 | Amazon |
| 71 | gpt-oss-120B (low) | 24 | OpenAI |
| 72 | gpt-oss-20B (high) | 24 | OpenAI |
| 73 | GPT-5.4 nano | 24 | OpenAI |
| 74 | NVIDIA Nemotron 3 Nano | 24 | NVIDIA |
| 75 | K2 Think V2 | 24 | MBZUAI |
| 76 | LongCat Flash Lite | 24 | LongCat |
| 77 | HyperCLOVA X SEED Think | 24 | Naver |
| 78 | GLM-4.6V | 23 | Z AI |
| 79 | K-EXAONE | 23 | LG AI Research |
| 80 | GPT-5.4 mini | 23 | OpenAI |
| 81 | Nova 2.0 Omni (low) | 23 | Amazon |
| 82 | Nova 2.0 Pro Preview | 23 | Amazon |
| 83 | Mi:dm K 2.5 Pro | 23 | Korea Telecom |
| 84 | Mistral Large 3 | 23 | Mistral |
| 85 | Ring-1T | 23 | InclusionAI |
| 86 | Qwen3.5 4B | 23 | Alibaba |
| 87 | INTELLECT-3 | 22 | Prime Intellect |
| 88 | Devstral 2 | 22 | Mistral |
| 89 | Solar Open 100B | 22 | Upstage |
| 90 | Gemini 2.5 Flash-Lite (Sep) | 22 | Google |
| 91 | Mistral Medium 3.1 | 21 | Mistral |
| 92 | gpt-oss-20B (low) | 21 | OpenAI |
| 93 | K2-V2 (high) | 21 | MBZUAI |
| 94 | Qwen3 Next 80B A3B | 20 | Alibaba |
| 95 | Tri-21B-think Preview | 20 | Trillion Labs |
| 96 | Devstral Small 2 | 19 | Mistral |
| 97 | Gemini 2.5 Flash-Lite (Sep) | 19 | Google |
| 98 | Motif-2-12.7B | 19 | Motif Technologies |
| 99 | Ling-1T | 19 | InclusionAI |
| 100 | Nova Premier | 19 | Amazon |
| 101 | Llama Nemotron Super 49B | 19 | NVIDIA |
| 102 | K2-V2 (medium) | 19 | MBZUAI |
| 103 | Mistral Small 4 | 19 | Mistral |
| 104 | Tri-21B-Think | 19 | Trillion Labs |
| 105 | Hermes 4 405B | 19 | Nous Research |
| 106 | Llama 3.3 Nemotron Super | 18 | NVIDIA |
| 107 | Llama 4 Maverick | 18 | Meta |
| 108 | Magistral Small 1.2 | 18 | Mistral |
| 109 | Sarvam 105B (high) | 18 | Sarvam |
| 110 | Nova 2.0 Lite | 18 | Amazon |
| 111 | Hermes 4 405B | 18 | Nous Research |
| 112 | Llama 3.1 405B | 17 | Meta |
| 113 | GLM-4.6V | 17 | Z AI |
| 114 | EXAONE 4.0 32B | 17 | LG AI Research |
| 115 | Nova 2.0 Omni | 17 | Amazon |
| 116 | DeepSeek R1 0528 Qwen3 8B | 16 | DeepSeek |
| 117 | Qwen3.5 2B | 16 | Alibaba |
| 118 | Nanbeige4.1-3B | 16 | Nanbeige |
| 119 | Hermes 4 70B | 16 | Nous Research |
| 120 | Ministral 3 14B | 16 | Mistral |
| 121 | DeepSeek R1 Distill L70B | 16 | DeepSeek |
| 122 | Falcon-H1R-7B | 16 | TII UAE |
| 123 | Ling-flash-2.0 | 16 | InclusionAI |
| 124 | Qwen3 Omni 30B A3B | 16 | Alibaba |
| 125 | Step3 VL 10B | 15 | StepFun |
| 126 | Llama Nemotron Ultra | 15 | NVIDIA |
| 127 | ERNIE 4.5 300B A47B | 15 | Baidu |
| 128 | Solar Pro 2 | 15 | Upstage |
| 129 | NVIDIA Nemotron Nano 12B | 15 | NVIDIA |
| 130 | Ministral 3 8B | 15 | Mistral |
| 131 | NVIDIA Nemotron Nano 9B | 15 | NVIDIA |
| 132 | NVIDIA Nemotron 3 Nano 4B | 15 | NVIDIA |
| 133 | Qwen3.5 2B | 15 | Alibaba |
| 134 | Llama Nemotron Super 49B | 15 | NVIDIA |
| 135 | Llama 3.3 70B | 14 | Meta |
| 136 | K2-V2 (low) | 14 | MBZUAI |
| 137 | Llama 3.1 Nemotron Nano 4B | 14 | NVIDIA |
| 138 | Kimi Linear 48B A3B | 14 | Kimi |
| 139 | Llama 3.3 Nemotron Super | 14 | NVIDIA |
| 140 | Ring-flash-2.0 | 14 | InclusionAI |
| 141 | Olmo 3.1 32B Think | 14 | AI2 |
| 142 | Solar Pro 2 | 14 | Upstage |
| 143 | Llama 4 Scout | 14 | Meta |
| 144 | Command A | 13 | Cohere |
| 145 | Llama 3.1 Nemotron 70B | 13 | NVIDIA |
| 146 | NVIDIA Nemotron 3 Nano | 13 | NVIDIA |
| 147 | NVIDIA Nemotron Nano 9B | 13 | NVIDIA |
| 148 | Hermes 4 70B | 13 | Nous Research |
| 149 | Sarvam 30B (high) | 12 | Sarvam |
| 150 | Olmo 3.1 32B Instruct | 12 | AI2 |
| 151 | R1 1776 | 12 | Perplexity |
| 152 | Llama 3.2 90B (Vision) | 12 | Meta |
| 153 | EXAONE 4.0 32B | 12 | LG AI Research |
| 154 | Ministral 3 3B | 11 | Mistral |
| 155 | DeepHermes 3 - Mistral 24B | 11 | Nous Research |
| 156 | Jamba 1.7 Large | 11 | AI21 Labs |
| 157 | Granite 4.0 H Small | 11 | IBM |
| 158 | Qwen3 Omni 30B A3B | 11 | Alibaba |
| 159 | Qwen3.5 0.8B | 11 | Alibaba |
| 160 | LFM2 24B A2B | 10 | Liquid AI |
| 161 | Phi-4 | 10 | Microsoft |
| 162 | Gemma 3 27B | 10 | Google |
| 163 | Nova Micro | 10 | Amazon |
| 164 | NVIDIA Nemotron Nano 12B | 10 | NVIDIA |
| 165 | Phi-4 Multimodal | 10 | Microsoft |
| 166 | Qwen3.5 0.8B | 10 | Alibaba |
| 167 | Jamba Reasoning 3B | 10 | AI21 Labs |
| 168 | Reka Flash 3 | 10 | Reka AI |
| 169 | Olmo 3 7B Think | 9 | AI2 |
| 170 | Ling-mini-2.0 | 9 | InclusionAI |
| 171 | Gemma 3 12B | 9 | Google |
| 172 | Llama 3.2 11B (Vision) | 9 | Meta |
| 173 | Phi-4 Mini | 8 | Microsoft |
| 174 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 175 | Olmo 3 7B | 8 | AI2 |
| 176 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 177 | LFM2.5-1.2B-Thinking | 8 | Liquid AI |
| 178 | Jamba 1.7 Mini | 8 | AI21 Labs |
| 179 | LFM2.5-1.2B-Instruct | 8 | Liquid AI |
| 180 | LFM2 2.6B | 8 | Liquid AI |
| 181 | Granite 4.0 H 1B | 8 | IBM |
| 182 | Gemma 3 270M | 8 | Google |
| 183 | Apertus 70B Instruct | 8 | Swiss AI |
| 184 | Granite 4.0 Micro | 8 | IBM |
| 185 | DeepHermes 3 - Llama 8B | 8 | Nous Research |
| 186 | Granite 4.0 1B | 7 | IBM |
| 187 | Molmo2-8B | 7 | AI2 |
| 188 | LFM2 8B A1B | 7 | Liquid AI |
| 189 | Gemma 3n E4B | 6 | Google |
| 190 | Gemma 3 4B | 6 | Google |
| 191 | LFM2.5-VL-1.6B | 6 | Liquid AI |
| 192 | Granite 4.0 350M | 6 | IBM |
| 193 | Apertus 8B Instruct | 6 | Swiss AI |
| 194 | Gemma 3 1B | 6 | Google |
| 195 | Granite 4.0 H 350M | 5 | IBM |
| 196 | Gemma 3n E2B | 5 | Google |

Data is for reference only. Official sources are authoritative. Click model names to view DataLearner model profiles.

Benchmark Components (Intelligence Index v4.0)

The Intelligence Index aggregates 10 rigorous benchmarks into a single holistic measure of AI capabilities, so that strength in one narrow area alone does not produce a high score.

  • GDPval-AA: Agentic real-world tasks
  • τ²-Bench: Agentic tool use
  • Terminal-Bench: Agentic coding
  • SciCode: Coding proficiency
  • AA-LCR: Long context reasoning
  • AA-Omniscience: Knowledge & hallucination
  • IFBench: Instruction following
  • Humanity's Last Exam: Reasoning & knowledge
  • GPQA Diamond: Scientific reasoning
  • CritPt: Physics reasoning

FAQ

What is the Artificial Analysis Intelligence Index?
The Artificial Analysis Intelligence Index v4.0 is a composite benchmark that aggregates performance across 10 challenging evaluations, spanning mathematics, science, coding, agentic tasks, and reasoning, to measure AI capabilities holistically. It is designed to avoid rewarding narrow specialization and to provide a single score for tracking progress.
How is the Intelligence Index calculated?
The index aggregates scores from 10 benchmarks: GDPval-AA (agentic real-world tasks), τ²-Bench (tool use), Terminal-Bench Hard (agentic coding), SciCode (coding), AA-LCR (long context reasoning), AA-Omniscience (knowledge & hallucination), IFBench (instruction following), Humanity's Last Exam (reasoning), GPQA Diamond (scientific reasoning), and CritPt (physics). All tests are independently run by Artificial Analysis on standardized hardware.
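
To make the idea of aggregation concrete, here is a minimal Python sketch of a composite score. The page does not state how the 10 benchmarks are weighted or normalized, so the equal weighting, the 0-100 score scale, the function name `composite_index`, and the example scores are all assumptions for illustration, not Artificial Analysis's actual method.

```python
from typing import Dict, Optional

# The ten component benchmarks named in the answer above.
COMPONENTS = [
    "GDPval-AA", "τ²-Bench", "Terminal-Bench Hard", "SciCode", "AA-LCR",
    "AA-Omniscience", "IFBench", "Humanity's Last Exam", "GPQA Diamond", "CritPt",
]


def composite_index(scores: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Return the weighted mean of per-benchmark scores (assumed 0-100).

    Equal weights are used when none are given. This weighting is an
    assumption; the real index may weight or normalize differently.
    """
    missing = [name for name in COMPONENTS if name not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    if weights is None:
        weights = {name: 1.0 for name in COMPONENTS}
    total_weight = sum(weights[name] for name in COMPONENTS)
    weighted_sum = sum(scores[name] * weights[name] for name in COMPONENTS)
    return weighted_sum / total_weight


# Made-up scores for a hypothetical model, not real benchmark results.
example_scores = {name: 50.0 for name in COMPONENTS}
example_scores["GPQA Diamond"] = 90.0
example_scores["CritPt"] = 10.0

print(round(composite_index(example_scores)))  # prints 50
```

In this sketch a very high score on one benchmark is offset by a low score on another, which is the sense in which a composite discourages narrow specialization.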
How does this differ from LMArena?
LMArena rankings are based on crowdsourced user votes (Elo ratings from blind A/B tests), reflecting subjective human preferences. The Artificial Analysis Intelligence Index uses standardized automated benchmarks with objective scoring, measuring technical capabilities across specific domains. Both perspectives are valuable: LMArena captures real-world user experience, while the AA Intelligence Index provides reproducible technical measurements.
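
For contrast with benchmark averaging, the following sketch shows the textbook Elo update applied to a single pairwise vote. It illustrates the general Elo technique only; the starting rating of 1000, the K-factor of 32, and the model names are assumptions, and LMArena's production rating procedure is not described on this page.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> Tuple[float, float]:
    """Return updated (rating_a, rating_b) after one blind A/B vote.

    k is the K-factor (assumed 32 here); it controls how far one vote
    moves the ratings.
    """
    actual_a = 1.0 if a_won else 0.0
    delta = k * (actual_a - expected_score(rating_a, rating_b))
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two hypothetical models start level; a voter prefers model_x.
model_x, model_y = elo_update(1000.0, 1000.0, a_won=True)
print(model_x, model_y)  # prints 1016.0 984.0
```

Ratings here depend on who voted and which pairs they saw, whereas a benchmark score depends only on the model's answers to a fixed test set.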
Where can I find the original data?
The original leaderboard and detailed methodology are available at artificialanalysis.ai. The Intelligence Index methodology is documented on the Intelligence Index page.