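The latest leaderboard of AI text generation models, based on anonymous user votes in the Text Generation Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.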
Top model: claude-opus-4-6-thinking
Highest score: 1,502
Number of models: 100
Data version: March 20, 2026
Data source: LMArena
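This leaderboard ranks today's strongest large AI models by their overall performance on text generation tasks. The data comes from LMArena (formerly LMSYS Chatbot Arena), currently the world's largest crowdsourced evaluation platform for AI models. Users chat with two anonymous models at the same time and vote for the better answer, so the ranking is determined entirely by real user preferences rather than by lab benchmarks.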
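Anonymous blind testing: users chat with two models whose identities are hidden and vote on answer quality, which removes brand bias.
Elo scoring: each model's score is an Elo-style rating, as used in chess, estimated with a Bradley-Terry model from head-to-head results. The higher the score, the more likely users are to pick that model's answer in a real conversation.
Broad scenario coverage: the votes span common real-world use cases such as programming, creative writing, mathematical reasoning, knowledge Q&A, and role-play.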
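To make the link between score and preference probability concrete, here is a minimal sketch that converts a score gap into an expected win rate. It assumes the conventional Elo logistic curve (base 10, scale 400) and ignores ties; the page does not state the exact parameters LMArena uses, so treat the function name `elo_win_probability` and its defaults as illustrative assumptions.

```python
def elo_win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Expected probability that model A is preferred over model B.

    Assumes the conventional Elo logistic curve (base 10, scale 400);
    the leaderboard's exact fitting parameters are not given in the source.
    """
    return 1.0 / (1.0 + 10 ** ((score_b - score_a) / scale))


# Scores taken from the table: rank 1 (1,502), rank 2 (1,501), rank 100 (1,388).
p_first_vs_last = elo_win_probability(1502, 1388)
p_first_vs_second = elo_win_probability(1502, 1501)

print(f"rank 1 vs rank 100: {p_first_vs_last:.3f}")
print(f"rank 1 vs rank 2:   {p_first_vs_second:.3f}")
```

Under these assumptions, even the 114-point gap between the first and last entries corresponds to a win rate of roughly two in three, and a one-point gap is practically a coin flip.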
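DataLearner adds Chinese-language commentary and in-depth analysis on top of the raw data, and links each model on the leaderboard to the DataLearner model library, so you can view model details, API pricing, benchmark scores, and other information in one click.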
Chart source: DataLearnerAI · Data source: LMArena
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-6-thinking | 1,502 | +/-6 | 11,801 | Anthropic | Proprietary |
| 2 | claude-opus-4-6 | 1,501 | +/-6 | 12,546 | Anthropic | Proprietary |
| 3 | gemini-3.1-pro-preview | 1,493 | +/-6 | 14,677 | Google | Proprietary |
| 4 | grok-4.20-beta1 | 1,492 | +/-7 | 7,396 | xAI | Proprietary |
| 5 | gemini-3-pro | 1,486 | +/-4 | 41,762 | Google | Proprietary |
| 6 | gpt-5.4-high | 1,485 | +/-9 | 4,965 | OpenAI | Proprietary |
| 7 | gpt-5.2-chat-latest-20260210 | 1,482 | +/-6 | 10,140 | OpenAI | Proprietary |
| 8 | grok-4.20-beta-0309-reasoning | 1,481 | +/-9 | 4,504 | xAI | Proprietary |
| 9 | gemini-3-flash | 1,475 | +/-4 | 31,060 | Google | Proprietary |
| 10 | claude-opus-4-5-20251101-thinking-32k | 1,474 | +/-4 | 37,036 | Anthropic | Proprietary |
| 11 | grok-4.1-thinking | 1,472 | +/-4 | 43,930 | xAI | Proprietary |
| 12 | claude-opus-4-5-20251101 | 1,469 | +/-4 | 41,976 | Anthropic | Proprietary |
| 13 | claude-sonnet-4-6 | 1,465 | +/-6 | 9,843 | Anthropic | Proprietary |
| 14 | qwen3.5-max-preview | 1,464 | +/-9 | 4,252 | Alibaba | Proprietary |
| 15 | gpt-5.3-chat-latest | 1,464 | +/-7 | 8,942 | OpenAI | Proprietary |
| 16 | gemini-3-flash (thinking-minimal) | 1,463 | +/-4 | 27,448 | Google | Proprietary |
| 17 | gpt-5.4 | 1,463 | +/-8 | 4,972 | OpenAI | Proprietary |
| 18 | dola-seed-2.0-preview | 1,462 | +/-6 | 10,651 | Bytedance | Proprietary |
| 19 | grok-4.1 | 1,461 | +/-4 | 47,757 | xAI | Proprietary |
| 20 | gpt-5.1-high | 1,455 | +/-4 | 40,759 | OpenAI | Proprietary |
| 21 | glm-5 | 1,455 | +/-6 | 11,093 | Z.ai | MIT |
| 22 | kimi-k2.5-thinking | 1,453 | +/-5 | 16,262 | Moonshot | Modified MIT |
| 23 | claude-sonnet-4-5-20250929 | 1,453 | +/-3 | 53,556 | Anthropic | Proprietary |
| 24 | claude-sonnet-4-5-20250929-thinking-32k | 1,453 | +/-3 | 55,811 | Anthropic | Proprietary |
| 25 | ernie-5.0-0110 | 1,452 | +/-5 | 18,715 | Baidu | Proprietary |
| 26 | qwen3.5-397b-a17b | 1,452 | +/-6 | 10,431 | Alibaba | Apache 2.0 |
| 27 | ernie-5.0-preview-1203 | 1,450 | +/-7 | 9,857 | Baidu | Proprietary |
| 28 | claude-opus-4-1-20250805-thinking-16k | 1,449 | +/-3 | 50,375 | Anthropic | Proprietary |
| 29 | gemini-2.5-pro | 1,448 | +/-3 | 103,317 | Google | Proprietary |
| 30 | claude-opus-4-1-20250805 | 1,447 | +/-3 | 78,224 | Anthropic | Proprietary |
| 31 | mimo-v2-pro | 1,445 | +/-10 | 3,531 | Xiaomi | Proprietary |
| 32 | gpt-4.5-preview-2025-02-27 | 1,444 | +/-6 | 14,547 | OpenAI | Proprietary |
| 33 | chatgpt-4o-latest-20250326 | 1,443 | +/-3 | 83,559 | OpenAI | Proprietary |
| 34 | glm-4.7 | 1,443 | +/-6 | 12,242 | Z.ai | MIT |
| 35 | gpt-5.2-high | 1,442 | +/-5 | 25,328 | OpenAI | Proprietary |
| 36 | gpt-5.2 | 1,440 | +/-5 | 22,231 | OpenAI | Proprietary |
| 37 | gpt-5.1 | 1,439 | +/-4 | 43,475 | OpenAI | Proprietary |
| 38 | gemini-3.1-flash-lite-preview | 1,438 | +/-9 | 3,881 | Google | Proprietary |
| 39 | qwen3-max-preview | 1,435 | +/-4 | 28,066 | Alibaba | Proprietary |
| 40 | gpt-5-high | 1,434 | +/-5 | 32,470 | OpenAI | Proprietary |
| 41 | kimi-k2.5-instant | 1,433 | +/-7 | 8,257 | Moonshot | Modified MIT |
| 42 | o3-2025-04-16 | 1,432 | +/-4 | 60,698 | OpenAI | Proprietary |
| 43 | grok-4-1-fast-reasoning | 1,431 | +/-4 | 37,473 | xAI | Proprietary |
| 44 | kimi-k2-thinking-turbo | 1,430 | +/-4 | 41,738 | Moonshot | Modified MIT |
| 45 | amazon-nova-exp-chat-26-02-10 | 1,429 | +/-10 | 3,467 | Amazon | Proprietary |
| 46 | gpt-5-chat | 1,426 | +/-4 | 32,009 | OpenAI | Proprietary |
| 47 | glm-4.6 | 1,426 | +/-4 | 36,102 | Z.ai | MIT |
| 48 | deepseek-v3.2-exp-thinking | 1,425 | +/-7 | 9,188 | DeepSeek | MIT |
| 49 | deepseek-v3.2 | 1,425 | +/-4 | 36,511 | DeepSeek | MIT |
| 50 | qwen3-max-2025-09-23 | 1,424 | +/-6 | 9,273 | Alibaba | Proprietary |
| 51 | claude-opus-4-20250514-thinking | 1,424 | +/-4 | 37,503 | Anthropic | Proprietary |
| 52 | deepseek-v3.2-exp | 1,423 | +/-6 | 12,088 | DeepSeek | MIT |
| 53 | qwen3-235b-a22b-instruct | 1,422 | +/-3 | 77,683 | Alibaba | Apache 2.0 |
| 54 | deepseek-v3.2-thinking | 1,422 | +/-4 | 31,048 | DeepSeek | MIT |
| 55 | deepseek-r1-0528 | 1,421 | +/-6 | 18,831 | DeepSeek | MIT |
| 56 | grok-4-fast-chat | 1,421 | +/-8 | 6,901 | xAI | Proprietary |
| 57 | ernie-5.0-preview-1022 | 1,419 | +/-9 | 4,782 | Baidu | Proprietary |
| 58 | deepseek-v3.1 | 1,418 | +/-6 | 15,150 | DeepSeek | MIT |
| 59 | kimi-k2-0905-preview | 1,418 | +/-6 | 11,924 | Moonshot | Modified MIT |
| 60 | qwen3.5-122b-a10b | 1,417 | +/-7 | 6,946 | Alibaba | Apache 2.0 |
| 61 | kimi-k2-0711-preview | 1,417 | +/-5 | 28,082 | Moonshot | Modified MIT |
| 62 | deepseek-v3.1-thinking | 1,417 | +/-7 | 11,885 | DeepSeek | MIT |
| 63 | deepseek-v3.1-terminus-think | 1,416 | +/-10 | 3,497 | DeepSeek | MIT |
| 64 | mistral-large-3 | 1,416 | +/-4 | 33,200 | Mistral | Apache 2.0 |
| 65 | deepseek-v3.1-terminus | 1,416 | +/-10 | 3,736 | DeepSeek | MIT |
| 66 | qwen3-vl-235b-a22b-instruct | 1,415 | +/-6 | 11,645 | Alibaba | Apache 2.0 |
| 67 | amazon-nova-exp-chat-26-01-10 | 1,414 | +/-10 | 3,439 | Amazon | Proprietary |
| 68 | gpt-4.1-2025-04-14 | 1,413 | +/-4 | 51,831 | OpenAI | Proprietary |
| 69 | claude-opus-4-20250514 | 1,413 | +/-4 | 44,988 | Anthropic | Proprietary |
| 70 | grok-3-preview-02-24 | 1,412 | +/-4 | 33,374 | xAI | Proprietary |
| 71 | gemini-2.5-flash | 1,411 | +/-3 | 102,736 | Google | Proprietary |
| 72 | glm-4.5 | 1,411 | +/-5 | 24,640 | Z.ai | MIT |
| 73 | grok-4-0709 | 1,410 | +/-4 | 42,034 | xAI | Proprietary |
| 74 | mistral-medium-2508 | 1,410 | +/-3 | 72,410 | Mistral | Proprietary |
| 75 | minimax-m2.7 | 1,407 | +/-11 | 2,981 | MiniMax | Proprietary |
| 76 | claude-haiku-4-5-20251001 | 1,407 | +/-3 | 54,261 | Anthropic | Proprietary |
| 77 | qwen3.5-27b | 1,406 | +/-7 | 6,957 | Alibaba | Apache 2.0 |
| 78 | minimax-m2.5 | 1,405 | +/-6 | 11,909 | MiniMax | Modified MIT |
| 79 | gemini-2.5-flash-preview | 1,405 | +/-4 | 33,278 | Google | Proprietary |
| 80 | grok-4-fast-reasoning | 1,405 | +/-5 | 18,993 | xAI | Proprietary |
| 81 | qwen3-235b-a22b-no-thinking | 1,403 | +/-4 | 38,797 | Alibaba | Apache 2.0 |
| 82 | o1-2024-12-17 | 1,402 | +/-4 | 27,807 | OpenAI | Proprietary |
| 83 | qwen3-next-80b-a3b-instruct | 1,401 | +/-5 | 23,187 | Alibaba | Apache 2.0 |
| 84 | qwen3.5-flash | 1,401 | +/-7 | 7,853 | Alibaba | Proprietary |
| 85 | qwen3.5-35b-a3b | 1,401 | +/-7 | 7,278 | Alibaba | Apache 2.0 |
| 86 | longcat-flash-chat | 1,400 | +/-6 | 11,517 | Meituan | MIT |
| 87 | qwen3-235b-a22b-thinking | 1,399 | +/-7 | 9,128 | Alibaba | Apache 2.0 |
| 88 | claude-sonnet-4-thinking | 1,399 | +/-4 | 35,733 | Anthropic | Proprietary |
| 89 | deepseek-r1 | 1,398 | +/-5 | 18,524 | DeepSeek | MIT |
| 90 | hunyuan-vision-1.5-thinking | 1,396 | +/-12 | 2,235 | Tencent | Proprietary |
| 91 | qwen3-vl-235b-a22b-thinking | 1,396 | +/-7 | 8,052 | Alibaba | Apache 2.0 |
| 92 | amazon-nova-exp-chat-12-10 | 1,396 | +/-10 | 3,720 | Amazon | Proprietary |
| 93 | deepseek-v3-0324 | 1,394 | +/-4 | 46,144 | DeepSeek | MIT |
| 94 | mai-1-preview | 1,393 | +/-5 | 18,095 | Microsoft AI | Proprietary |
| 95 | mimo-v2-flash (non-thinking) | 1,392 | +/-4 | 25,427 | Xiaomi | MIT |
| 96 | o4-mini-2025-04-16 | 1,390 | +/-4 | 46,166 | OpenAI | Proprietary |
| 97 | gpt-5-mini-high | 1,390 | +/-5 | 27,372 | OpenAI | Proprietary |
| 98 | claude-sonnet-4-20250514 | 1,389 | +/-4 | 41,021 | Anthropic | Proprietary |
| 99 | step-3.5-flash | 1,389 | +/-6 | 13,885 | StepFun | Apache 2.0 |
| 100 | o1-preview | 1,388 | +/-5 | 31,122 | OpenAI | Proprietary |
The data is for reference only; official sources take precedence. The links next to the model names lead to the DataLearner model detail pages.
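When reading adjacent ranks, it can help to check whether two models' 95% confidence intervals overlap. The sketch below does that for a few rows copied from the table. The `Entry` type and `intervals_overlap` helper are hypothetical names introduced for illustration, and interval overlap is only a rough heuristic, not a formal significance test.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One leaderboard row (hypothetical helper type, not part of any API)."""
    name: str
    score: float
    ci: float  # half-width of the 95% confidence interval, e.g. 6 for "+/-6"


def intervals_overlap(a: Entry, b: Entry) -> bool:
    """True if [score - ci, score + ci] of the two entries share any point."""
    return abs(a.score - b.score) <= a.ci + b.ci


# Values copied from the table.
first = Entry("claude-opus-4-6-thinking", 1502, 6)
second = Entry("claude-opus-4-6", 1501, 6)
last = Entry("o1-preview", 1388, 5)

print(intervals_overlap(first, second))  # True: ranks 1 and 2 are not clearly separated
print(intervals_overlap(first, last))    # False: the gap far exceeds both intervals
```

Read this way, many neighboring rows in the table are better treated as a tie than as a strict ordering.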