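Text Generation Arena: Text Generation Model Leaderboard
The latest leaderboard of AI text generation models, based on anonymous user votes in the Text Generation Arena. It lists each model's Elo score, 95% confidence interval, vote count, organization, and license.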
Top model: Claude Opus 4.6
Highest score: 1,502
Number of models: 150
Data version: April 14, 2026
Data source: LMArena
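About This Leaderboard
This leaderboard ranks today's strongest large AI models by overall performance on text generation tasks. The data comes from LMArena (formerly LMSYS Chatbot Arena), currently the world's largest crowdsourced evaluation platform for AI models. Users chat with two anonymous models at the same time and vote for the better response, so the ranking is determined entirely by real user preferences rather than by lab benchmarks.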
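Evaluation Method Overview
Anonymous blind testing: users chat with two models whose identities are hidden and vote on response quality, which removes brand bias.
Elo scoring: based on the Elo rating system from chess (Bradley-Terry model), each model's strength score is computed from head-to-head results. A higher score means the model is more likely to be chosen by users in real conversations.
Broad scenario coverage: covers high-frequency real-world scenarios such as coding, creative writing, mathematical reasoning, knowledge Q&A, and role-play.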
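To make the Elo scoring point concrete, here is a minimal sketch of how a score gap maps to a preference probability. It assumes the conventional Elo logistic curve (base 10, scale 400); the source does not state the exact parameters LMArena uses. The function names `win_probability` and `elo_update` and the step size `k` are illustrative, not part of any published API. The online update is included only for intuition: a Bradley-Terry fit estimates all scores jointly from the full set of votes rather than one vote at a time.

```python
def win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B.

    Uses the standard Elo logistic curve (assumed: base 10, scale 400).
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


def elo_update(score_a: float, score_b: float, outcome_a: float,
               k: float = 4.0, scale: float = 400.0) -> tuple[float, float]:
    """Apply one online Elo update after a single vote.

    outcome_a is 1.0 if A won, 0.0 if B won, and 0.5 for a tie.
    k is a hypothetical step size chosen for illustration.
    """
    expected_a = win_probability(score_a, score_b, scale)
    delta = k * (outcome_a - expected_a)
    # The update is zero-sum: what A gains, B loses.
    return score_a + delta, score_b - delta


if __name__ == "__main__":
    # Scores of the rank 1 and rank 150 entries in the table below.
    p = win_probability(1502, 1347)
    print(f"P(rank 1 preferred over rank 150) = {p:.2f}")
    # Scores of the rank 1 and rank 2 entries: a 6-point gap is nearly a coin flip.
    print(f"P(rank 1 preferred over rank 2)   = {win_probability(1502, 1496):.2f}")
```

Under these assumptions, the 155-point gap between the first and last rows of the table corresponds to roughly a 71% preference rate, while gaps of a few points near the top are close to 50/50.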
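DataLearner adds Chinese-language commentary and in-depth analysis on top of the raw data and links each leaderboard model to the DataLearner model library, so you can view model details, API pricing, benchmark scores, and other information in one click.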
Overall Rankings
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | Claude Opus 4.6 | 1,502 | / | 17,219 | Anthropic | / |
| 2 | Claude Opus 4.6 | 1,496 | / | 18,377 | Anthropic | / |
| 3 | Muse Spark | 1,495 | / | 4,182 | Facebook AI Research | / |
| 4 | Gemini 3.1 Pro Preview | 1,493 | / | 21,708 | Google DeepMind | / |
| 5 | Gemini 3.0 Pro (Preview 11-2025) | 1,486 | / | 41,578 | Google DeepMind | / |
| 6 | grok-4.20-beta1 | 1,485 | / | 10,884 | xAI | / |
| 7 | gpt-5.4-high | 1,481 | / | 10,633 | OpenAI | / |
| 8 | grok-4.20-beta-0309-reasoning | 1,479 | / | 10,713 | xAI | / |
| 9 | gpt-5.2-chat-latest-20260210 | 1,476 | / | 16,810 | OpenAI | / |
| 10 | grok-4.20-multi-agent-beta-0309 | 1,476 | / | 11,079 | xAI | / |
| 11 | Gemini 3.0 Flash | 1,474 | / | 30,922 | Google DeepMind | / |
| 12 | Claude Opus 4 | 1,473 | / | 37,292 | Anthropic | / |
| 13 | GLM 5.1 | 1,471 | / | 6,274 | Zhipu AI | / |
| 14 | Grok 4.1 Thinking | 1,470 | / | 48,508 | xAI | / |
| 15 | Claude Opus 4 | 1,469 | / | 48,318 | Anthropic | / |
| 16 | gpt-5.4 | 1,466 | / | 10,990 | OpenAI | / |
| 17 | qwen3.5-max-preview | 1,466 | / | 8,774 | Alibaba | / |
| 18 | Gemini 3.0 Flash | 1,462 | / | 34,519 | Google DeepMind | / |
| 19 | claude-sonnet-4-6 | 1,461 | / | 10,935 | Anthropic | / |
| 20 | dola-seed-2.0-pro | 1,460 | / | 19,770 | Bytedance | / |
| 21 | Grok 4.1 | 1,460 | / | 52,460 | xAI | / |
| 22 | gpt-5.4-mini-high | 1,457 | / | 8,174 | OpenAI | / |
| 23 | GLM-5 | 1,456 | / | 14,988 | Zhipu AI | / |
| 24 | gpt-5.3-chat-latest | 1,455 | / | 15,448 | OpenAI | / |
| 25 | GPT-5.1 Pro | 1,454 | / | 41,035 | OpenAI | / |
| 26 | claude-sonnet-4-5-20250929-thinking-32k | 1,452 | / | 61,159 | Anthropic | / |
| 27 | claude-sonnet-4-5-20250929 | 1,451 | / | 59,047 | Anthropic | / |
| 28 | Kimi K2 Thinking | 1,451 | / | 21,678 | Moonshot AI | / |
| 29 | gemma-4-31b | 1,450 | / | 5,839 | / | / |
| 30 | ERNIE 5.0 | 1,450 | / | 23,507 | Baidu | / |
| 31 | ERNIE 5.0 | 1,449 | / | 9,808 | Baidu | / |
| 32 | claude-opus-4-1-20250805-thinking-16k | 1,448 | / | 50,147 | Anthropic | / |
| 33 | Gemini 2.5 Pro Experimental 03-25 | 1,448 | / | 108,717 | Google DeepMind | / |
| 34 | mimo-v2-pro | 1,447 | / | 9,239 | Xiaomi | / |
| 35 | claude-opus-4-1-20250805 | 1,447 | / | 77,831 | Anthropic | / |
| 36 | Qwen3.5-397B-A17B | 1,446 | / | 16,360 | Alibaba | / |
| 37 | GPT-4.5 | 1,444 | / | 14,547 | OpenAI | / |
| 38 | chatgpt-4o-latest-20250326 | 1,443 | / | 82,981 | OpenAI | / |
| 39 | GLM-4.7 | 1,443 | / | 12,180 | Zhipu AI | / |
| 40 | GPT-5.2 Pro | 1,441 | / | 31,439 | OpenAI | / |
| 41 | GPT-5.2 | 1,439 | / | 28,519 | OpenAI | / |
| 42 | gemma-4-26b-a4b | 1,439 | / | 5,795 | / | / |
| 43 | GPT-5.1 Instant | 1,438 | / | 43,688 | OpenAI | / |
| 44 | longcat-flash-chat-2602-exp | 1,437 | / | 6,751 | Meituan | / |
| 45 | gemini-3.1-flash-lite-preview | 1,436 | / | 16,969 | / | / |
| 46 | Qwen3 Max (Preview) | 1,435 | / | 27,926 | Alibaba | / |
| 47 | GPT-5-Pro | 1,433 | / | 32,239 | OpenAI | / |
| 48 | kimi-k2.5-instant | 1,432 | / | 8,234 | Moonshot | / |
| 49 | grok-4-1-fast-reasoning | 1,432 | / | 43,555 | xAI | / |
| 50 | OpenAI o3 | 1,431 | / | 60,167 | OpenAI | / |
| 51 | kimi-k2-thinking-turbo | 1,430 | / | 47,037 | Moonshot | / |
| 52 | amazon-nova-experimental-chat-26-02-10 | 1,428 | / | 3,448 | Amazon | / |
| 53 | GPT-5 | 1,426 | / | 31,842 | OpenAI | / |
| 54 | GLM-4.6 | 1,426 | / | 35,904 | Zhipu AI | / |
| 55 | DeepSeek V3.2-Exp | 1,425 | / | 9,140 | DeepSeek-AI | / |
| 56 | qwen3-max-2025-09-23 | 1,424 | / | 9,244 | Alibaba | / |
| 57 | claude-opus-4-20250514-thinking-16k | 1,424 | / | 37,185 | Anthropic | / |
| 58 | DeepSeek V3.2 | 1,423 | / | 42,036 | DeepSeek-AI | / |
| 59 | Qwen3-235B-A22B-2507 | 1,423 | / | 82,850 | Alibaba | / |
| 60 | DeepSeek V3.2 | 1,423 | / | 36,441 | DeepSeek-AI | / |
| 61 | DeepSeek V3.2-Exp | 1,423 | / | 12,013 | DeepSeek-AI | / |
| 62 | DeepSeek-R1-0528 | 1,422 | / | 18,590 | DeepSeek-AI | / |
| 63 | Grok 4 Fast | 1,421 | / | 6,864 | xAI | / |
| 64 | ERNIE 5.0 | 1,419 | / | 4,762 | Baidu | / |
| 65 | qwen3.5-122b-a10b | 1,418 | / | 13,066 | Alibaba | / |
| 66 | kimi-k2-0905-preview | 1,418 | / | 11,862 | Moonshot | / |
| 67 | DeepSeek-V3.1 | 1,418 | / | 15,068 | DeepSeek-AI | / |
| 68 | Kimi K2 | 1,417 | / | 27,861 | Moonshot AI | / |
| 69 | DeepSeek-V3.1 | 1,417 | / | 11,825 | DeepSeek-AI | / |
| 70 | deepseek-v3.1-terminus-thinking | 1,416 | / | 3,488 | DeepSeek | / |
| 71 | DeepSeek-V3.1 Terminus | 1,416 | / | 3,722 | DeepSeek-AI | / |
| 72 | Qwen3-VL-235B-A22B-Instruct | 1,416 | / | 11,608 | Alibaba | / |
| 73 | Mistral Large 3 | 1,415 | / | 39,232 | MistralAI | / |
| 74 | amazon-nova-experimental-chat-26-01-10 | 1,415 | / | 3,432 | Amazon | / |
| 75 | gpt-4.1-2025-04-14 | 1,413 | / | 51,399 | OpenAI | / |
| 76 | Claude Opus 4 | 1,412 | / | 44,550 | Anthropic | / |
| 77 | Grok 3 | 1,412 | / | 33,045 | xAI | / |
| 78 | Gemini 2.5 Flash | 1,411 | / | 108,193 | Google DeepMind | / |
| 79 | GLM-4.5 | 1,411 | / | 24,507 | Zhipu AI | / |
| 80 | grok-4-0709 | 1,410 | / | 41,734 | xAI | / |
| 81 | Magistral-Medium-2506 | 1,410 | / | 78,272 | MistralAI | / |
| 82 | claude-haiku-4-5-20251001 | 1,408 | / | 60,452 | Anthropic | / |
| 83 | gemini-2.5-flash-preview-09-2025 | 1,405 | / | 33,128 | / | / |
| 84 | grok-4-fast-reasoning | 1,404 | / | 18,875 | xAI | / |
| 85 | qwen3-235b-a22b-no-thinking | 1,403 | / | 38,470 | Alibaba | / |
| 86 | minimax-m2.7 | 1,402 | / | 7,635 | MiniMax | / |
| 87 | gpt-5.4-nano-high | 1,402 | / | 7,479 | OpenAI | / |
| 88 | MiniMax M2.5 | 1,402 | / | 18,224 | MiniMaxAI | / |
| 89 | qwen3.5-27b | 1,402 | / | 12,770 | Alibaba | / |
| 90 | o1-2024-12-17 | 1,401 | / | 27,807 | OpenAI | / |
| 91 | qwen3-next-80b-a3b-instruct | 1,401 | / | 23,060 | Alibaba | / |
| 92 | longcat-flash-chat | 1,401 | / | 11,476 | Meituan | / |
| 93 | qwen3-235b-a22b-thinking-2507 | 1,400 | / | 9,061 | Alibaba | / |
| 94 | qwen3.5-flash | 1,399 | / | 13,598 | Alibaba | / |
| 95 | claude-sonnet-4-20250514-thinking-32k | 1,398 | / | 35,416 | Anthropic | / |
| 96 | DeepSeek-R1 | 1,397 | / | 18,524 | DeepSeek-AI | / |
| 97 | hunyuan-vision-1.5-thinking | 1,397 | / | 2,225 | Tencent | / |
| 98 | qwen3.5-35b-a3b | 1,396 | / | 13,268 | Alibaba | / |
| 99 | qwen3-vl-235b-a22b-thinking | 1,396 | / | 8,021 | Alibaba | / |
| 100 | amazon-nova-experimental-chat-12-10 | 1,395 | / | 3,707 | Amazon | / |
| 101 | DeepSeek-V3-0324 | 1,394 | / | 45,799 | DeepSeek-AI | / |
| 102 | mai-1-preview | 1,393 | / | 18,015 | Microsoft AI | / |
| 103 | mimo-v2-flash (non-thinking) | 1,391 | / | 31,132 | Xiaomi | / |
| 104 | Step 3.5 Flash | 1,391 | / | 19,379 | StepFunAI | / |
| 105 | o4-mini-2025-04-16 | 1,390 | / | 45,738 | OpenAI | / |
| 106 | gpt-5-mini-high | 1,389 | / | 27,246 | OpenAI | / |
| 107 | claude-sonnet-4-20250514 | 1,389 | / | 40,649 | Anthropic | / |
| 108 | o1-preview | 1,388 | / | 31,122 | OpenAI | / |
| 109 | qwen3-coder-480b-a35b-instruct | 1,387 | / | 25,958 | Alibaba | / |
| 110 | hunyuan-t1-20250711 | 1,387 | / | 4,736 | Tencent | / |
| 111 | mimo-v2-flash (thinking) | 1,387 | / | 11,021 | Xiaomi | / |
| 112 | claude-3-7-sonnet-20250219-thinking-32k | 1,386 | / | 38,995 | Anthropic | / |
| 113 | mistral-medium-2505 | 1,386 | / | 33,442 | Mistral | / |
| 114 | minimax-m2.1-preview | 1,385 | / | 17,234 | MiniMax | / |
| 115 | qwen3-30b-a3b-instruct-2507 | 1,383 | / | 23,932 | Alibaba | / |
| 116 | hunyuan-turbos-20250416 | 1,383 | / | 10,775 | Tencent | / |
| 117 | gpt-4.1-mini-2025-04-14 | 1,382 | / | 39,548 | OpenAI | / |
| 118 | gemini-2.5-flash-lite-preview-09-2025-no-thinking | 1,380 | / | 47,541 | / | / |
| 119 | GLM-4.6V | 1,378 | / | 2,817 | Zhipu AI | / |
| 120 | trinity-large-preview | 1,374 | / | 14,164 | Arcee AI | / |
| 121 | gemini-2.5-flash-lite-preview-06-17-thinking | 1,374 | / | 33,170 | / | / |
| 122 | qwen3-235b-a22b | 1,374 | / | 26,423 | Alibaba | / |
| 123 | qwen2.5-max | 1,374 | / | 32,709 | Alibaba | / |
| 124 | glm-4.5-air | 1,373 | / | 31,372 | Z.ai | / |
| 125 | claude-3-5-sonnet-20241022 | 1,372 | / | 88,515 | Anthropic | / |
| 126 | claude-3-7-sonnet-20250219 | 1,370 | / | 43,394 | Anthropic | / |
| 127 | qwen3-next-80b-a3b-thinking | 1,369 | / | 13,836 | Alibaba | / |
| 128 | glm-4.7-flash | 1,368 | / | 11,819 | Z.ai | / |
| 129 | amazon-nova-experimental-chat-11-10 | 1,367 | / | 25,539 | Amazon | / |
| 130 | gemma-3-27b-it | 1,365 | / | 47,842 | / | / |
| 131 | minimax-m1 | 1,363 | / | 35,505 | MiniMax | / |
| 132 | o3-mini-high | 1,363 | / | 18,589 | OpenAI | / |
| 133 | grok-3-mini-high | 1,363 | / | 17,078 | xAI | / |
| 134 | nvidia-nemotron-3-super-120b-a12b | 1,361 | / | 7,435 | Nvidia | / |
| 135 | gemini-2.0-flash-001 | 1,360 | / | 43,911 | / | / |
| 136 | deepseek-v3 | 1,358 | / | 21,770 | DeepSeek | / |
| 137 | grok-3-mini-beta | 1,357 | / | 22,879 | xAI | / |
| 138 | mistral-small-2506 | 1,357 | / | 17,850 | Mistral | / |
| 139 | intellect-3 | 1,356 | / | 5,357 | Prime Intellect | / |
| 140 | gpt-oss-120b | 1,354 | / | 30,883 | OpenAI | / |
| 141 | command-a-03-2025 | 1,353 | / | 56,663 | Cohere | / |
| 142 | glm-4.5v | 1,353 | / | 4,983 | Z.ai | / |
| 143 | gemini-2.0-flash-lite-preview-02-05 | 1,353 | / | 24,955 | / | / |
| 144 | gemini-1.5-pro-002 | 1,351 | / | 55,606 | / | / |
| 145 | amazon-nova-experimental-chat-10-20 | 1,350 | / | 11,535 | Amazon | / |
| 146 | hunyuan-turbos-20250226 | 1,348 | / | 2,220 | Tencent | / |
| 147 | step-3 | 1,348 | / | 6,585 | StepFun | / |
| 148 | o3-mini | 1,347 | / | 57,563 | OpenAI | / |
| 149 | qwen3-32b | 1,347 | / | 3,926 | Alibaba | / |
| 150 | llama-3.1-nemotron-ultra-253b-v1 | 1,347 | / | 2,549 | Nvidia | / |
Data is for reference only; official sources take precedence. Links next to model names lead to the DataLearner model detail pages.