Artificial Analysis Intelligence Index
Artificial Analysis Intelligence Index aggregates multiple rigorous benchmarks to compare AI model intelligence across coding, reasoning, science, tool use, and agentic tasks.
Top Model
GPT-5.5 (xhigh)
Top Score
60
Model Count
212
Data version
2026-05-10
Data source: Artificial Analysis
Ranking Table
| Rank | Model | Intelligence Index | Organization |
|---|---|---|---|
| 1 | GPT-5.5 (xhigh) | 60 | OpenAI |
| 2 | GPT-5.5 (high) | 59 | OpenAI |
| 3 | Opus 4.7 (max) | 57 | Anthropic |
| 4 | Gemini 3.1 Pro Preview | 57 | Google Deep Mind |
| 5 | GPT-5.5 (medium) | 57 | OpenAI |
| 6 | Kimi K2.6 | 54 | Moonshot AI |
| 7 | MiMo-V2.5-Pro | 54 | Xiaomi |
| 8 | GPT-5.3 Codex (xhigh) | 54 | OpenAI |
| 9 | | 53 | xAI |
| 10 | Muse Spark | 52 | Facebook AI Research |
| 11 | Opus 4.7 (high) | 52 | Anthropic |
| 12 | Qwen3.6-Max-Preview | 52 | Alibaba |
| 13 | Claude Sonnet 4.6 (max) | 52 | Anthropic |
| 14 | DeepSeek-V4-Pro (max) | 52 | DeepSeek-AI |
| 15 | GLM 5.1 | 51 | Zhipu AI |
| 16 | GPT-5.5 (low) | 51 | OpenAI |
| 17 | Qwen 3.6 Plus Preview | 50 | Alibaba |
| 18 | DeepSeek-V4-Pro (high) | 50 | DeepSeek-AI |
| 19 | GLM-5 | 50 | Zhipu AI |
| 20 | | 50 | MiniMaxAI |
| 21 | MiMo-V2.5 | 49 | Xiaomi |
| 22 | GPT-5.4 mini (xhigh) | 49 | OpenAI |
| 23 | GPT-5.4 (low) | 48 | OpenAI |
| 24 | GLM-5-Turbo | 47 | Zhipu AI |
| 25 | DeepSeek-V4-Flash (max) | 47 | DeepSeek-AI |
| 26 | Gemini 3.0 Flash | 46 | Google Deep Mind |
| 27 | Qwen3.6-27B | 46 | Alibaba |
| 28 | Qwen3.5-397B-A17B | 45 | Alibaba |
| 29 | Nova 2 Omni(Preview) | 45 | Amazon |
| 30 | DeepSeek-V4-Flash (high) | 45 | DeepSeek-AI |
| 31 | Claude Sonnet 4.6 (non-reasoning) | 44 | Anthropic |
| 32 | GPT-5.4 nano (xhigh) | 44 | OpenAI |
| 33 | GLM 5.1 | 44 | Zhipu AI |
| 34 | Qwen3.6-35B-A3B | 43 | Alibaba |
| 35 | MiMo-V2-Omni | 43 | Xiaomi |
| 36 | Kimi K2.6 | 43 | Moonshot AI |
| 37 | GLM-5V-Turbo | 43 | Zhipu AI |
| 38 | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | 43 | Anthropic |
| 39 | Hy3-preview | 42 | Tencent |
| 40 | Qwen3.5-122B-A10B | 42 | Alibaba |
| 41 | Gemini 2.0 Flash Experimental | 41 | DeepMind |
| 42 | Gemini 3.1 Pro Preview (low) | 41 | Google Deep Mind |
| 43 | GPT-5.5 (non-reasoning) | 41 | OpenAI |
| 44 | GLM-5 | 41 | Zhipu AI |
| 45 | Qwen3.5-397B-A17B | 40 | Alibaba |
| 46 | DeepSeek-V4-Pro | 39 | DeepSeek-AI |
| 47 | Mistral Medium 3.5 | 39 | Mistral |
| 48 | Gemma 4 31B | 39 | DeepMind |
| 49 | Qwen3.5-Omni-Plus | 39 | Alibaba |
| 50 | | 39 | xAI |
| 51 | Step 3.5 Flash | 38 | StepFunAI |
| 52 | OpenAI o3 | 38 | OpenAI |
| 53 | GPT-5.4 nano | 38 | OpenAI |
| 54 | GPT-5.4 mini (medium) | 38 | OpenAI |
| 55 | Kimi K2.5 | 37 | Moonshot AI |
| 56 | Qwen3.6-27B | 37 | Alibaba |
| 57 | Haiku 4.5 | 37 | Anthropic |
| 58 | DeepSeek-V4-Flash | 36 | DeepSeek-AI |
| 59 | NVIDIA Nemotron 3 Super | 36 | NVIDIA |
| 60 | Qwen3.5-122B-A10B | 36 | Alibaba |
| 61 | Nova 2 Pro(Preview) (medium) | 36 | Amazon |
| 62 | MiMo-V2.5-Pro | 36 | Xiaomi |
| 63 | GPT-5.4 (non-reasoning) | 35 | OpenAI |
| 64 | Gemini 3.0 Flash | 35 | Google Deep Mind |
| 65 | Gemini 2.5-Pro | 35 | Google Deep Mind |
| 66 | Nova 2 Lite (high) | 35 | Amazon |
| 67 | Hy3-preview | 34 | Tencent |
| 68 | Ling-2.6-1T | 34 | InclusionAI |
| 69 | Doubao Seed Code | 34 | ByteDance Seed |
| 70 | Gemini 3.1 Flash-Lite Preview | 34 | Google |
| 71 | GPT OSS 120B (high) | 33 | OpenAI |
| 72 | Mercury 2 | 33 | Inception |
| 73 | Qwen3.5-9B-Instruct | 32 | Alibaba |
| 74 | Gemma 4 31B | 32 | DeepMind |
| 75 | K-EXAONE | 32 | LG AI Research |
| 76 | | 32 | xAI |
| 77 | Nova 2 Pro(Preview) (low) | 32 | Amazon |
| 78 | Trinity Large Thinking | 32 | Arcee AI |
| 79 | Qwen3.6-35B-A3B | 32 | Alibaba |
| 80 | Gemma 4 26B A4B | 31 | DeepMind |
| 81 | Haiku 4.5 | 31 | Anthropic |
| 82 | | 31 | xAI |
| 83 | Qwen3.5-35B-A3B | 31 | Alibaba |
| 84 | MiMo-V2-Flash | 30 | Xiaomi |
| 85 | EXAONE 4.5 33B | 30 | LG AI Research |
| 86 | Nova 2 Lite (medium) | 30 | Amazon |
| 87 | ERNIE 5.0 | 29 | Baidu |
| 88 | | 29 | xAI |
| 89 | | 29 | xAI |
| 90 | Nemotron Cascade 2 30B A3B | 28 | NVIDIA |
| 91 | Qwen3-Coder-Next | 28 | Alibaba |
| 92 | Nova 2 Omni(Preview) (medium) | 28 | Amazon |
| 93 | Mistral Small 4 | 28 | Mistral |
| 94 | Qwen3.5-9B-Instruct | 27 | Alibaba |
| 95 | Magistral Medium 1.2 | 27 | Mistral |
| 96 | Gemma 4 26B A4B | 27 | DeepMind |
| 97 | Qwen3.5 4B | 27 | Alibaba |
| 98 | DeepSeek-R1-0528 | 27 | DeepSeek-AI |
| 99 | Qwen3-Next | 27 | Alibaba |
| 100 | Ling 2.6 Flash | 26 | InclusionAI |
| 101 | Qwen3.5-Omni-Flash | 26 | Alibaba |
| 102 | Solar Pro 3 | 26 | Upstage |
| 103 | JT-MINI | 25 | China Mobile |
| 104 | Nova 2 Lite (low) | 25 | Amazon |
| 105 | GPT OSS 20B (high) | 24 | OpenAI |
| 106 | GPT OSS 120B (low) | 24 | OpenAI |
| 107 | GPT-5.4 nano | 24 | OpenAI |
| 108 | NVIDIA Nemotron 3 Nano | 24 | NVIDIA |
| 109 | LongCat Flash Lite | 24 | LongCat |
| 110 | | 24 | xAI |
| 111 | K-EXAONE | 23 | LG AI Research |
| 112 | GPT-5.4 mini | 23 | OpenAI |
| 113 | Nova 2 Omni(Preview) (low) | 23 | Amazon |
| 114 | Nova 2 Pro(Preview) | 23 | Amazon |
| 115 | Mi:dm K 2.5 Pro | 23 | Korea Telecom |
| 116 | Mistral Large 3 | 23 | MistralAI |
| 117 | Ring-1T | 23 | InclusionAI |
| 118 | Qwen3.5 4B | 23 | Alibaba |
| 119 | INTELLECT-3 | 22 | Prime Intellect |
| 120 | Devstral 2 | 22 | Mistral |
| 121 | Solar Open 100B | 22 | Upstage |
| 122 | Gemini 2.5 Flash-Lite-Preview-09-2025 | 22 | Google Deep Mind |
| 123 | Nemotron 3 Nano Omni 30B A3B Reasoning | 21 | NVIDIA |
| 124 | GPT OSS 20B (low) | 21 | OpenAI |
| 125 | Qwen3-Next | 20 | Alibaba |
| 126 | Devstral Small 2 | 19 | Mistral |
| 127 | Gemini 2.5 Flash-Lite-Preview-09-2025 | 19 | Google Deep Mind |
| 128 | Motif-2-12.7B | 19 | Motif Technologies |
| 129 | Ling-1T | 19 | InclusionAI |
| 130 | Nova Premier | 19 | Amazon |
| 131 | Gemma 4 E4B | 19 | DeepMind |
| 132 | Llama Nemotron Super 49B v1.5 | 19 | Meta |
| 133 | Mistral Small 4 | 19 | Mistral |
| 134 | Llama 3.3 Nemotron Super 49B | 18 | Meta |
| 135 | Llama 4 Maverick | 18 | Facebook AI Research |
| 136 | Sarvam 105B (high) | 18 | Sarvam |
| 137 | Magistral Small 1.2 | 18 | Mistral |
| 138 | Nova 2 Lite | 18 | Amazon |
| 139 | Llama3.1-405B | 17 | Facebook AI Research |
| 140 | EXAONE 4.0 32B | 17 | LG AI Research |
| 141 | Nova 2 Omni(Preview) | 17 | Amazon |
| 142 | Qwen3.5 2B | 16 | Alibaba |
| 143 | Nanbeige4.1-3B | 16 | Nanbeige |
| 144 | Ministral 3 14B | 16 | MistralAI |
| 145 | DeepSeek-R1-Distill-Llama-70B | 16 | DeepSeek-AI |
| 146 | Falcon-H1R-7B | 16 | TII UAE |
| 147 | Ling-flash-2.0 | 16 | InclusionAI |
| 148 | Qwen3-Omni-30B-A3B | 16 | Alibaba |
| 149 | Step3 VL 10B | 15 | StepFun |
| 150 | Gemma 4 E2B | 15 | DeepMind |
| 151 | Llama Nemotron Ultra | 15 | NVIDIA |
| 152 | ERNIE-4.5-300B-A47B | 15 | Baidu |
| 153 | Solar Pro 2 | 15 | Upstage |
| 154 | NVIDIA Nemotron Nano 12B v2 VL | 15 | NVIDIA |
| 155 | Ministral 3 8B | 15 | MistralAI |
| 156 | Gemma 4 E4B | 15 | DeepMind |
| 157 | NVIDIA Nemotron Nano 9B V2 | 15 | NVIDIA |
| 158 | Granite 4.1 30B | 15 | IBM |
| 159 | NVIDIA Nemotron 3 Nano 4B | 15 | NVIDIA |
| 160 | Qwen3.5 2B | 15 | Alibaba |
| 161 | Llama Nemotron Super 49B v1.5 | 15 | Meta |
| 162 | Llama3.3-70B-Instruct | 14 | Facebook AI Research |
| 163 | Llama 3.1 Nemotron Nano 4B v1.1 | 14 | Meta |
| 164 | Kimi Linear 48B A3B Instruct | 14 | Kimi |
| 165 | Llama 3.3 Nemotron Super 49B | 14 | Meta |
| 166 | Ring-flash-2.0 | 14 | InclusionAI |
| 167 | Solar Pro 2 | 14 | Upstage |
| 168 | Llama 4 Scout | 14 | Facebook AI Research |
| 169 | C4AI Command A (202503) | 13 | CohereAI |
| 170 | Llama 3.1 Nemotron 70B | 13 | NVIDIA |
| 171 | NVIDIA Nemotron 3 Nano | 13 | NVIDIA |
| 172 | NVIDIA Nemotron Nano 9B V2 | 13 | NVIDIA |
| 173 | Granite 4.1 8B | 12 | IBM |
| 174 | Sarvam 30B (high) | 12 | Sarvam |
| 175 | Gemma 4 E2B | 12 | DeepMind |
| 176 | R1 1776 | 12 | Perplexity |
| 177 | Llama 3.2-Vision-90B | 12 | Facebook AI Research |
| 178 | EXAONE 4.0 32B | 12 | LG AI Research |
| 179 | Ministral 3 3B | 11 | Mistral |
| 180 | Jamba 1.7 Large | 11 | AI21 Labs |
| 181 | Granite 4.0 H Small | 11 | IBM |
| 182 | Qwen3-Omni-30B-A3B | 11 | Alibaba |
| 183 | Qwen3.5 0.8B | 11 | Alibaba |
| 184 | LFM2 24B A2B | 10 | Liquid AI |
| 185 | Phi 4 - 14B | 10 | Microsoft Azure |
| 186 | Amazon Nova Micro | 10 | Amazon |
| 187 | NVIDIA Nemotron Nano 12B v2 VL | 10 | NVIDIA |
| 188 | Phi-4-multimodal-instruct | 10 | Microsoft Azure |
| 189 | Qwen3.5 0.8B | 10 | Alibaba |
| 190 | Jamba Reasoning 3B | 10 | AI21 Labs |
| 191 | Gemini 3.0 Flash | 10 | Google Deep Mind |
| 192 | Ling-mini-2.0 | 9 | InclusionAI |
| 193 | Llama 3.2-Vision-11B | 9 | Facebook AI Research |
| 194 | Granite 4.1 3B | 9 | IBM |
| 195 | Phi-4-mini-instruct (3.8B) | 8 | Microsoft Azure |
| 196 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 197 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 198 | LFM2.5-1.2B-Thinking | 8 | Liquid AI |
| 199 | Jamba 1.7 Mini | 8 | AI21 Labs |
| 200 | LFM2.5-1.2B-Instruct | 8 | Liquid AI |
| 201 | LFM2 2.6B | 8 | Liquid AI |
| 202 | Granite 4.0 H 1B | 8 | IBM |
| 203 | Gemma 3-270M | 8 | Google Deep Mind |
| 204 | Apertus 70B Instruct | 8 | Swiss AI |
| 205 | Granite 4.0 Micro | 8 | IBM |
| 206 | Granite 4.0 1B | 7 | IBM |
| 207 | LFM2 8B A1B | 7 | Liquid AI |
| 208 | LFM2.5-VL-1.6B | 6 | Liquid AI |
| 209 | Granite 4.0 350M | 6 | IBM |
| 210 | Apertus 8B Instruct | 6 | Swiss AI |
| 211 | Granite 4.0 H 350M | 5 | IBM |
| 212 | Tiny Aya Global | 5 | Cohere |
Data is for reference only; official sources are authoritative. Model names link to DataLearner model profiles.
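The ranking table numbers rows sequentially even where scores tie (for example, three models share a score of 57 at ranks 3 to 5). As an illustration only, the following Python sketch shows standard competition ranking, in which tied scores share one rank. The function name `competition_rank` and the tie rule itself are assumptions made for this example; the source does not state how the leaderboard breaks ties.

```python
from typing import List, Tuple


def competition_rank(rows: List[Tuple[str, int]]) -> List[Tuple[int, str, int]]:
    """Assign "1, 2, 2, 4" style ranks: equal scores share the rank of the first tied row."""
    # sorted() is stable, so rows with equal scores keep their input order.
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    ranked: List[Tuple[int, str, int]] = []
    for position, (model, score) in enumerate(ordered, start=1):
        if ranked and ranked[-1][2] == score:
            rank = ranked[-1][0]  # same score as the previous row: reuse its rank
        else:
            rank = position  # new score: rank equals the 1-based position
        ranked.append((rank, model, score))
    return ranked


# First six rows of the ranking table (model, Intelligence Index).
top_rows = [
    ("GPT-5.5 (xhigh)", 60),
    ("GPT-5.5 (high)", 59),
    ("Opus 4.7 (max)", 57),
    ("Gemini 3.1 Pro Preview", 57),
    ("GPT-5.5 (medium)", 57),
    ("Kimi K2.6", 54),
]

for rank, model, score in competition_rank(top_rows):
    print(rank, model, score)
```

Under this rule the three models scoring 57 would all be shown at rank 3, and Kimi K2.6 would follow at rank 6.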
Benchmark Components (Intelligence Index v4.0)
The Intelligence Index aggregates 10 benchmarks into a single holistic measure of AI capability, so that strength in one narrow area cannot dominate the score.
| Benchmark | Focus |
|---|---|
| GDPval-AA | Agentic real-world tasks |
| τ²-Bench | Agentic tool use |
| Terminal-Bench Hard | Agentic coding |
| SciCode | Coding proficiency |
| AA-LCR | Long context reasoning |
| AA-Omniscience | Knowledge & hallucination |
| IFBench | Instruction following |
| Humanity's Last Exam | Reasoning & knowledge |
| GPQA Diamond | Scientific reasoning |
| CritPt | Physics reasoning |
FAQ
What is the Artificial Analysis Intelligence Index?
The Artificial Analysis Intelligence Index v4.0 is a composite benchmark that aggregates performance across 10 challenging evaluations — spanning mathematics, science, coding, agentic tasks, and reasoning — to measure AI capabilities holistically. It is designed to prevent narrow specialization and provide a single score for tracking progress.
How is the Intelligence Index calculated?
The index aggregates scores from 10 benchmarks: GDPval-AA (agentic real-world tasks), τ²-Bench (tool use), Terminal-Bench Hard (agentic coding), SciCode (coding), AA-LCR (long context reasoning), AA-Omniscience (knowledge & hallucination), IFBench (instruction following), Humanity's Last Exam (reasoning), GPQA Diamond (scientific reasoning), and CritPt (physics). All tests are independently run by Artificial Analysis on standardized hardware.
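The answer above lists the ten component benchmarks but not how they are weighted. As a rough sketch only, the Python example below assumes every benchmark is reported on a common 0 to 100 scale and combines them with an unweighted mean. Both the equal weighting and the helper name `composite_index` are assumptions made for illustration, and the scores used are placeholders rather than real results; the official methodology may differ.

```python
from statistics import mean
from typing import Mapping

# Component benchmarks as named in the FAQ answer.
BENCHMARKS = [
    "GDPval-AA",
    "τ²-Bench",
    "Terminal-Bench Hard",
    "SciCode",
    "AA-LCR",
    "AA-Omniscience",
    "IFBench",
    "Humanity's Last Exam",
    "GPQA Diamond",
    "CritPt",
]


def composite_index(scores: Mapping[str, float]) -> float:
    """Return the unweighted mean of all component scores (assumed weighting)."""
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        # Refuse to compute a composite from an incomplete set of benchmarks.
        raise ValueError(f"missing benchmark scores: {missing}")
    return round(mean(scores[name] for name in BENCHMARKS), 1)


# Placeholder scores, not real measurements.
placeholder_scores = {name: 50.0 for name in BENCHMARKS}
print(composite_index(placeholder_scores))  # 50.0
```

Requiring all ten scores keeps a model from receiving a composite based on a partial benchmark set, which fits the stated goal of measuring capability holistically.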
How does this differ from LMArena?
LMArena rankings are based on crowdsourced user votes (Elo ratings from blind A/B tests), reflecting subjective human preferences. The Artificial Analysis Intelligence Index uses standardized automated benchmarks with objective scoring, measuring technical capabilities across specific domains. Both perspectives are valuable — LMArena captures real-world user experience, while AA Intelligence Index provides reproducible technical measurements.
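To make the contrast concrete, here is a minimal sketch of a classic Elo update after one blind A/B vote. It illustrates how pairwise preferences turn into ratings and is not a description of LMArena's actual pipeline; the function names, the K-factor of 32, and the starting rating of 1000 are all assumptions for this example.

```python
from typing import Tuple


def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(
    rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0
) -> Tuple[float, float]:
    """Update both ratings after one vote.

    outcome_a is 1.0 if A was preferred, 0.0 if B was preferred, 0.5 for a tie.
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (outcome_a - expected_a)
    # B's outcome and expectation are the complements of A's.
    new_b = rating_b + k * ((1.0 - outcome_a) - (1.0 - expected_a))
    return new_a, new_b


# Two models start level; a voter prefers model A.
print(elo_update(1000.0, 1000.0, 1.0))  # (1016.0, 984.0)
```

A rating produced this way depends on who voted and which pairs were shown, whereas a benchmark-based index yields the same score whenever the same tests are rerun under the same conditions.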
Where can I find the original data?
The original leaderboard and detailed methodology are available at artificialanalysis.ai. The Intelligence Index methodology is documented on the Intelligence Index page.