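Artificial Analysis Intelligence Index: AI Model Intelligence Leaderboard
Artificial Analysis Intelligence Index v4.0 combines 10 authoritative benchmarks (GDPval-AA, Terminal-Bench, GPQA Diamond, SciCode, and others) to evaluate and rank AI models across mathematics, science, coding, reasoning, and other dimensions.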
Top model: GPT-5.5 (xhigh)
Highest score: 60
Number of models: 212
Data version: May 10, 2026
Data source: Artificial Analysis
Overall Rankings
| Rank | Model | Intelligence Index | Organization |
|---|---|---|---|
| 1 | GPT-5.5 (xhigh) | 60 | OpenAI |
| 2 | GPT-5.5 (high) | 59 | OpenAI |
| 3 | Opus 4.7 (max) | 57 | Anthropic |
| 4 | Gemini 3.1 Pro Preview | 57 | Google DeepMind |
| 5 | GPT-5.5 (medium) | 57 | OpenAI |
| 6 | Kimi K2.6 | 54 | Moonshot AI |
| 7 | MiMo-V2.5-Pro | 54 | Xiaomi |
| 8 | GPT-5.3 Codex (xhigh) | 54 | OpenAI |
| 9 | | 53 | xAI |
| 10 | Muse Spark | 52 | Facebook AI Research Lab |
| 11 | Opus 4.7 (high) | 52 | Anthropic |
| 12 | Qwen3.6-Max-Preview | 52 | Alibaba |
| 13 | Claude Sonnet 4.6 (max) | 52 | Anthropic |
| 14 | DeepSeek-V4-Pro (max) | 52 | DeepSeek-AI |
| 15 | GLM 5.1 | 51 | Zhipu AI |
| 16 | GPT-5.5 (low) | 51 | OpenAI |
| 17 | Qwen 3.6 Plus Preview | 50 | Alibaba |
| 18 | DeepSeek-V4-Pro (high) | 50 | DeepSeek-AI |
| 19 | GLM-5 | 50 | Zhipu AI |
| 20 | | 50 | MiniMaxAI |
| 21 | MiMo-V2.5 | 49 | Xiaomi |
| 22 | GPT-5.4 mini (xhigh) | 49 | OpenAI |
| 23 | GPT-5.4 (low) | 48 | OpenAI |
| 24 | GLM-5-Turbo | 47 | Zhipu AI |
| 25 | DeepSeek-V4-Flash (max) | 47 | DeepSeek-AI |
| 26 | Gemini 3.0 Flash | 46 | Google DeepMind |
| 27 | Qwen3.6-27B | 46 | Alibaba |
| 28 | Qwen3.5-397B-A17B | 45 | Alibaba |
| 29 | Nova 2 Omni(Preview) | 45 | Amazon |
| 30 | DeepSeek-V4-Flash (high) | 45 | DeepSeek-AI |
| 31 | Claude Sonnet 4.6 (non-reasoning) | 44 | Anthropic |
| 32 | GPT-5.4 nano (xhigh) | 44 | OpenAI |
| 33 | GLM 5.1 | 44 | Zhipu AI |
| 34 | Qwen3.6-35B-A3B | 43 | Alibaba |
| 35 | MiMo-V2-Omni | 43 | Xiaomi |
| 36 | Kimi K2.6 | 43 | Moonshot AI |
| 37 | GLM-5V-Turbo | 43 | Zhipu AI |
| 38 | Claude Sonnet 4.6 (Non-reasoning, Low Effort) | 43 | Anthropic |
| 39 | Hy3-preview | 42 | Tencent |
| 40 | Qwen3.5-122B-A10B | 42 | Alibaba |
| 41 | Gemini 2.0 Flash Experimental | 41 | DeepMind |
| 42 | Gemini 3.1 Pro Preview (low) | 41 | Google DeepMind |
| 43 | GPT-5.5 (non-reasoning) | 41 | OpenAI |
| 44 | GLM-5 | 41 | Zhipu AI |
| 45 | Qwen3.5-397B-A17B | 40 | Alibaba |
| 46 | DeepSeek-V4-Pro | 39 | DeepSeek-AI |
| 47 | Mistral Medium 3.5 | 39 | Mistral |
| 48 | Gemma 4 31B | 39 | DeepMind |
| 49 | Qwen3.5-Omni-Plus | 39 | Alibaba |
| 50 | | 39 | xAI |
| 51 | Step 3.5 Flash | 38 | StepFunAI |
| 52 | OpenAI o3 | 38 | OpenAI |
| 53 | GPT-5.4 nano | 38 | OpenAI |
| 54 | GPT-5.4 mini (medium) | 38 | OpenAI |
| 55 | Kimi K2.5 | 37 | Moonshot AI |
| 56 | Qwen3.6-27B | 37 | Alibaba |
| 57 | Haiku 4.5 | 37 | Anthropic |
| 58 | DeepSeek-V4-Flash | 36 | DeepSeek-AI |
| 59 | NVIDIA Nemotron 3 Super | 36 | NVIDIA |
| 60 | Qwen3.5-122B-A10B | 36 | Alibaba |
| 61 | Nova 2 Pro(Preview) (medium) | 36 | Amazon |
| 62 | MiMo-V2.5-Pro | 36 | Xiaomi |
| 63 | GPT-5.4 (non-reasoning) | 35 | OpenAI |
| 64 | Gemini 3.0 Flash | 35 | Google DeepMind |
| 65 | Gemini 2.5-Pro | 35 | Google DeepMind |
| 66 | Nova 2 Lite (high) | 35 | Amazon |
| 67 | Hy3-preview | 34 | Tencent |
| 68 | Ling-2.6-1T | 34 | InclusionAI |
| 69 | Doubao Seed Code | 34 | ByteDance Seed |
| 70 | Gemini 3.1 Flash-Lite Preview | 34 | Google |
| 71 | GPT OSS 120B (high) | 33 | OpenAI |
| 72 | Mercury 2 | 33 | Inception |
| 73 | Qwen3.5-9B-Instruct | 32 | Alibaba |
| 74 | Gemma 4 31B | 32 | DeepMind |
| 75 | K-EXAONE | 32 | LG AI Research |
| 76 | | 32 | xAI |
| 77 | Nova 2 Pro(Preview) (low) | 32 | Amazon |
| 78 | Trinity Large Thinking | 32 | Arcee AI |
| 79 | Qwen3.6-35B-A3B | 32 | Alibaba |
| 80 | Gemma 4 26B A4B | 31 | DeepMind |
| 81 | Haiku 4.5 | 31 | Anthropic |
| 82 | | 31 | xAI |
| 83 | Qwen3.5-35B-A3B | 31 | Alibaba |
| 84 | MiMo-V2-Flash | 30 | Xiaomi |
| 85 | EXAONE 4.5 33B | 30 | LG AI Research |
| 86 | Nova 2 Lite (medium) | 30 | Amazon |
| 87 | ERNIE 5.0 | 29 | Baidu |
| 88 | | 29 | xAI |
| 89 | | 29 | xAI |
| 90 | Nemotron Cascade 2 30B A3B | 28 | NVIDIA |
| 91 | Qwen3-Coder-Next | 28 | Alibaba |
| 92 | Nova 2 Omni(Preview) (medium) | 28 | Amazon |
| 93 | Mistral Small 4 | 28 | Mistral |
| 94 | Qwen3.5-9B-Instruct | 27 | Alibaba |
| 95 | Magistral Medium 1.2 | 27 | Mistral |
| 96 | Gemma 4 26B A4B | 27 | DeepMind |
| 97 | Qwen3.5 4B | 27 | Alibaba |
| 98 | DeepSeek-R1-0528 | 27 | DeepSeek-AI |
| 99 | Qwen3-Next | 27 | Alibaba |
| 100 | Ling 2.6 Flash | 26 | InclusionAI |
| 101 | Qwen3.5-Omni-Flash | 26 | Alibaba |
| 102 | Solar Pro 3 | 26 | Upstage |
| 103 | JT-MINI | 25 | China Mobile |
| 104 | Nova 2 Lite (low) | 25 | Amazon |
| 105 | GPT OSS 20B (high) | 24 | OpenAI |
| 106 | GPT OSS 120B (low) | 24 | OpenAI |
| 107 | GPT-5.4 nano | 24 | OpenAI |
| 108 | NVIDIA Nemotron 3 Nano | 24 | NVIDIA |
| 109 | LongCat Flash Lite | 24 | LongCat |
| 110 | | 24 | xAI |
| 111 | K-EXAONE | 23 | LG AI Research |
| 112 | GPT-5.4 mini | 23 | OpenAI |
| 113 | Nova 2 Omni(Preview) (low) | 23 | Amazon |
| 114 | Nova 2 Pro(Preview) | 23 | Amazon |
| 115 | Mi:dm K 2.5 Pro | 23 | Korea Telecom |
| 116 | Mistral Large 3 | 23 | MistralAI |
| 117 | Ring-1T | 23 | InclusionAI |
| 118 | Qwen3.5 4B | 23 | Alibaba |
| 119 | INTELLECT-3 | 22 | Prime Intellect |
| 120 | Devstral 2 | 22 | Mistral |
| 121 | Solar Open 100B | 22 | Upstage |
| 122 | Gemini 2.5 Flash-Lite-Preview-09-2025 | 22 | Google DeepMind |
| 123 | Nemotron 3 Nano Omni 30B A3B Reasoning | 21 | NVIDIA |
| 124 | GPT OSS 20B (low) | 21 | OpenAI |
| 125 | Qwen3-Next | 20 | Alibaba |
| 126 | Devstral Small 2 | 19 | Mistral |
| 127 | Gemini 2.5 Flash-Lite-Preview-09-2025 | 19 | Google DeepMind |
| 128 | Motif-2-12.7B | 19 | Motif Technologies |
| 129 | Ling-1T | 19 | InclusionAI |
| 130 | Nova Premier | 19 | Amazon |
| 131 | Gemma 4 E4B | 19 | DeepMind |
| 132 | Llama Nemotron Super 49B v1.5 | 19 | Meta |
| 133 | Mistral Small 4 | 19 | Mistral |
| 134 | Llama 3.3 Nemotron Super 49B | 18 | Meta |
| 135 | Llama 4 Maverick | 18 | Facebook AI Research Lab |
| 136 | Sarvam 105B (high) | 18 | Sarvam |
| 137 | Magistral Small 1.2 | 18 | Mistral |
| 138 | Nova 2 Lite | 18 | Amazon |
| 139 | Llama3.1-405B | 17 | Facebook AI Research Lab |
| 140 | EXAONE 4.0 32B | 17 | LG AI Research |
| 141 | Nova 2 Omni(Preview) | 17 | Amazon |
| 142 | Qwen3.5 2B | 16 | Alibaba |
| 143 | Nanbeige4.1-3B | 16 | Nanbeige |
| 144 | Ministral 3 14B | 16 | MistralAI |
| 145 | DeepSeek-R1-Distill-Llama-70B | 16 | DeepSeek-AI |
| 146 | Falcon-H1R-7B | 16 | TII UAE |
| 147 | Ling-flash-2.0 | 16 | InclusionAI |
| 148 | Qwen3-Omni-30B-A3B | 16 | Alibaba |
| 149 | Step3 VL 10B | 15 | StepFun |
| 150 | Gemma 4 E2B | 15 | DeepMind |
| 151 | Llama Nemotron Ultra | 15 | NVIDIA |
| 152 | ERNIE-4.5-300B-A47B | 15 | Baidu |
| 153 | Solar Pro 2 | 15 | Upstage |
| 154 | NVIDIA Nemotron Nano 12B v2 VL | 15 | NVIDIA |
| 155 | Ministral 3 8B | 15 | MistralAI |
| 156 | Gemma 4 E4B | 15 | DeepMind |
| 157 | NVIDIA Nemotron Nano 9B V2 | 15 | NVIDIA |
| 158 | Granite 4.1 30B | 15 | IBM |
| 159 | NVIDIA Nemotron 3 Nano 4B | 15 | NVIDIA |
| 160 | Qwen3.5 2B | 15 | Alibaba |
| 161 | Llama Nemotron Super 49B v1.5 | 15 | Meta |
| 162 | Llama3.3-70B-Instruct | 14 | Facebook AI Research Lab |
| 163 | Llama 3.1 Nemotron Nano 4B v1.1 | 14 | Meta |
| 164 | Kimi Linear 48B A3B Instruct | 14 | Kimi |
| 165 | Llama 3.3 Nemotron Super 49B | 14 | Meta |
| 166 | Ring-flash-2.0 | 14 | InclusionAI |
| 167 | Solar Pro 2 | 14 | Upstage |
| 168 | Llama 4 Scout | 14 | Facebook AI Research Lab |
| 169 | C4AI Command A (202503) | 13 | CohereAI |
| 170 | Llama 3.1 Nemotron 70B | 13 | NVIDIA |
| 171 | NVIDIA Nemotron 3 Nano | 13 | NVIDIA |
| 172 | NVIDIA Nemotron Nano 9B V2 | 13 | NVIDIA |
| 173 | Granite 4.1 8B | 12 | IBM |
| 174 | Sarvam 30B (high) | 12 | Sarvam |
| 175 | Gemma 4 E2B | 12 | DeepMind |
| 176 | R1 1776 | 12 | Perplexity |
| 177 | Llama 3.2-Vision-90B | 12 | Facebook AI Research Lab |
| 178 | EXAONE 4.0 32B | 12 | LG AI Research |
| 179 | Ministral 3 3B | 11 | Mistral |
| 180 | Jamba 1.7 Large | 11 | AI21 Labs |
| 181 | Granite 4.0 H Small | 11 | IBM |
| 182 | Qwen3-Omni-30B-A3B | 11 | Alibaba |
| 183 | Qwen3.5 0.8B | 11 | Alibaba |
| 184 | LFM2 24B A2B | 10 | Liquid AI |
| 185 | Phi 4 - 14B | 10 | Microsoft Azure |
| 186 | Amazon Nova Micro | 10 | Amazon |
| 187 | NVIDIA Nemotron Nano 12B v2 VL | 10 | NVIDIA |
| 188 | Phi-4-multimodal-instruct | 10 | Microsoft Azure |
| 189 | Qwen3.5 0.8B | 10 | Alibaba |
| 190 | Jamba Reasoning 3B | 10 | AI21 Labs |
| 191 | Gemini 3.0 Flash | 10 | Google DeepMind |
| 192 | Ling-mini-2.0 | 9 | InclusionAI |
| 193 | Llama 3.2-Vision-11B | 9 | Facebook AI Research Lab |
| 194 | Granite 4.1 3B | 9 | IBM |
| 195 | Phi-4-mini-instruct (3.8B) | 8 | Microsoft Azure |
| 196 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 197 | Exaone 4.0 1.2B | 8 | LG AI Research |
| 198 | LFM2.5-1.2B-Thinking | 8 | Liquid AI |
| 199 | Jamba 1.7 Mini | 8 | AI21 Labs |
| 200 | LFM2.5-1.2B-Instruct | 8 | Liquid AI |
| 201 | LFM2 2.6B | 8 | Liquid AI |
| 202 | Granite 4.0 H 1B | 8 | IBM |
| 203 | Gemma 3-270M | 8 | Google DeepMind |
| 204 | Apertus 70B Instruct | 8 | Swiss AI |
| 205 | Granite 4.0 Micro | 8 | IBM |
| 206 | Granite 4.0 1B | 7 | IBM |
| 207 | LFM2 8B A1B | 7 | Liquid AI |
| 208 | LFM2.5-VL-1.6B | 6 | Liquid AI |
| 209 | Granite 4.0 350M | 6 | IBM |
| 210 | Apertus 8B Instruct | 6 | Swiss AI |
| 211 | Granite 4.0 H 350M | 5 | IBM |
| 212 | Tiny Aya Global | 5 | Cohere |
Data is provided for reference only; official sources take precedence. The links next to model names lead to the DataLearner model detail pages.
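Benchmark Composition (Intelligence Index v4.0)
The Intelligence Index combines 10 rigorous benchmarks to measure AI model capability comprehensively and to avoid overfitting to any single dimension.

| Benchmark | Capability measured |
|---|---|
| GDPval-AA | Real-world agentic tasks |
| τ²-Bench | Agentic tool use |
| Terminal-Bench | Agentic coding |
| SciCode | Coding |
| AA-LCR | Long-context reasoning |
| AA-Omniscience | Knowledge and hallucination detection |
| IFBench | Instruction following |
| Humanity's Last Exam | Reasoning and knowledge |
| GPQA Diamond | Scientific reasoning |
| CritPt | Physics reasoning |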
Frequently Asked Questions (FAQ)
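What is the Artificial Analysis Intelligence Index?
Artificial Analysis Intelligence Index v4.0 is a composite evaluation index that aggregates 10 challenging evaluations, covering mathematics, science, coding, agentic tasks, and reasoning, to measure AI capability comprehensively. It is designed to prevent overfitting to any single dimension and to provide one unified score for tracking model progress.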
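How is the Intelligence Index calculated?
The index combines scores from 10 evaluations: GDPval-AA (real-world agentic tasks), τ²-Bench (tool use), Terminal-Bench Hard (agentic coding), SciCode (coding), AA-LCR (long-context reasoning), AA-Omniscience (knowledge and hallucination detection), IFBench (instruction following), Humanity's Last Exam (reasoning), GPQA Diamond (scientific reasoning), and CritPt (physics reasoning). Artificial Analysis runs all tests independently on standardized hardware.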
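This page does not say how the ten scores are weighted, so the following is only a minimal sketch that assumes an equal-weight average; the real index may weight benchmarks or categories differently. The function name `composite_index` and the example scores are hypothetical placeholders, not real results for any model.

```python
from statistics import mean

# Benchmark names are taken from the answer above.
BENCHMARKS = [
    "GDPval-AA", "τ²-Bench", "Terminal-Bench Hard", "SciCode", "AA-LCR",
    "AA-Omniscience", "IFBench", "Humanity's Last Exam", "GPQA Diamond", "CritPt",
]


def composite_index(scores: dict[str, float]) -> float:
    """Return the equal-weight mean of the ten benchmark scores.

    Equal weighting is an assumption made for illustration only.
    """
    missing = [name for name in BENCHMARKS if name not in scores]
    if missing:
        raise ValueError(f"missing benchmark scores: {missing}")
    return mean(scores[name] for name in BENCHMARKS)


# Made-up scores on a 0-100 scale, in the same order as BENCHMARKS.
example_scores = dict(zip(BENCHMARKS, [40, 80, 50, 45, 70, 30, 75, 35, 90, 15]))
print(round(composite_index(example_scores)))  # prints 53
```

Requiring all ten scores before averaging keeps a model with missing benchmarks from receiving an inflated or deflated composite.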
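How does this differ from the LMArena leaderboard?
LMArena rankings are based on crowdsourced user votes (Elo ratings from blind A/B comparisons) and reflect subjective human preference. The Artificial Analysis Intelligence Index instead scores models objectively with standardized automated benchmarks that measure technical capability in specific domains. Both are valuable: LMArena captures real-world user experience, while the AA Intelligence Index provides reproducible technical measurements.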
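To make the contrast concrete, here is a sketch of the textbook Elo update applied to one blind A/B vote. It is not LMArena's actual rating pipeline, which this page does not describe; the K-factor of 32, the 400-point scale, and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def elo_update(rating_a: float, rating_b: float, a_won: bool,
               k: float = 32.0) -> tuple[float, float]:
    """Return the new (rating_a, rating_b) after one head-to-head vote."""
    exp_a = expected_score(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - exp_a)
    # Elo is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated models; a user prefers model A's answer.
print(elo_update(1000.0, 1000.0, a_won=True))  # prints (1016.0, 984.0)
```

Ratings of this kind depend on who votes and which pairs are shown, whereas a benchmark score depends only on the model's outputs for a fixed test set.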
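Where can I find the original data?
The original leaderboard and detailed methodology are available at artificialanalysis.ai. The Intelligence Index methodology is described on the Intelligence Index page.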