Text Generation Arena Leaderboard
The latest AI text generation leaderboard, based on anonymous user voting on LMArena. It covers Elo scores, confidence intervals, and vote counts for leading language models.
Top Model: Opus 4.7 (thinking)
Top Score: 1,503
Model Count: 357
Data version: 2026-05-07
Data source: LMArena
About This Leaderboard
This leaderboard ranks the strongest AI models for text generation. Data comes from LMArena (formerly LMSYS Chatbot Arena), the world's largest crowdsourced AI evaluation platform. Users chat with two anonymous models side-by-side and vote for the better response — rankings are determined entirely by real user preferences, not lab benchmarks.
Methodology Overview
Blind testing: Users chat with two anonymous models and vote based on response quality, eliminating brand bias.
Elo scoring: A Bradley-Terry model, which yields Elo-style ratings like those used in chess, calculates each model's strength score from battle outcomes. A higher score means users prefer that model more often.
Broad scenario coverage: Testing spans coding, creative writing, math reasoning, Q&A, role-playing, and more.
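To make the scoring step concrete, here is a minimal sketch of fitting Bradley-Terry strengths from pairwise votes and mapping them onto an Elo-like scale. It is an illustration only, not LMArena's actual pipeline: the function names, the iterative update used for fitting, the 400-point scale, the 1000-point base, and the sample battles are all assumptions, and ties are ignored.

```python
import math
from collections import defaultdict

def fit_bradley_terry(battles, iters=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Assumes no ties and that every model has at least one win;
    a model with zero wins would end up with strength 0 here.
    """
    models = sorted({m for pair in battles for m in pair})
    wins = defaultdict(float)    # total wins per model
    games = defaultdict(float)   # battles per unordered pair of models
    for winner, loser in battles:
        wins[winner] += 1.0
        games[frozenset((winner, loser))] += 1.0

    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            denom = sum(
                games[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i and frozenset((i, j)) in games
            )
            updated[i] = wins[i] / denom if denom else strength[i]
        # Rescale so the average strength stays at 1 (the scale is arbitrary).
        total = sum(updated.values())
        strength = {m: v * len(models) / total for m, v in updated.items()}
    return strength

def to_elo_scale(strengths, scale=400.0, base=1000.0):
    """Map strengths to Elo-like scores (scale and base are assumed values)."""
    return {m: base + scale * math.log10(s) for m, s in strengths.items()}

# Hypothetical votes: each tuple is (preferred model, other model).
battles = (
    [("model_a", "model_b")] * 7 + [("model_b", "model_a")] * 3
    + [("model_b", "model_c")] * 6 + [("model_c", "model_b")] * 4
    + [("model_a", "model_c")] * 8 + [("model_c", "model_a")] * 2
)
ratings = to_elo_scale(fit_bradley_terry(battles))
for name, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.0f}")
```

Because the fit uses all battles at once, the result does not depend on the order in which votes arrived, which is one reason a batch Bradley-Terry fit can be preferable to a running per-vote update.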
DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.
Ranking Table
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | Opus 4.7 (thinking) | 1,503 | +/-6 | 8,945 | Anthropic | Proprietary |
| 2 | Claude Opus 4.6 (thinking) | 1,502 | +/-5 | 23,616 | Anthropic | Proprietary |
| 3 | Claude Opus 4.6 | 1,498 | +/-5 | 25,089 | Anthropic | Proprietary |
| 4 | Gemini 3.1 Pro Preview | 1,492 | +/-4 | 29,468 | Google Deep Mind | Proprietary |
| 5 | Opus 4.7 | 1,491 | +/-6 | 9,614 | Anthropic | Proprietary |
| 6 | Muse Spark | 1,490 | +/-6 | 10,491 | Facebook AI Research Lab | Proprietary |
| 7 | Gemini 3.0 Pro (Preview 11-2025) | 1,486 | +/-4 | 41,381 | Google Deep Mind | Proprietary |
| 8 | gpt-5.5-high | 1,484 | +/-7 | 6,488 | OpenAI | Proprietary |
| 9 | | 1,480 | +/-5 | 18,791 | xAI | Proprietary |
| 10 | gpt-5.2-chat-latest-20260210 | 1,477 | +/-5 | 23,717 | OpenAI | Proprietary |
| 11 | gpt-5.4-high | 1,477 | +/-5 | 17,146 | OpenAI | Proprietary |
| 12 | | 1,477 | +/-5 | 17,538 | xAI | Proprietary |
| 13 | GPT-5.5 | 1,475 | +/-7 | 6,653 | OpenAI | Proprietary |
| 14 | ernie-5.1 | 1,474 | +/-8 | 5,733 | Baidu | Proprietary |
| 15 | | 1,474 | +/-5 | 17,728 | xAI | Proprietary |
| 16 | Gemini 3.0 Flash | 1,474 | +/-4 | 30,784 | Google Deep Mind | Proprietary |
| 17 | Claude Opus 4 (thinking-32k) | 1,473 | +/-4 | 37,168 | Anthropic | Proprietary |
| 18 | gpt-5.5-instant | 1,473 | +/-11 | 2,833 | OpenAI | Proprietary |
| 19 | GLM 5.1 | 1,471 | +/-6 | 11,349 | Zhipu AI | MIT |
| 20 | Claude Opus 4 | 1,468 | +/-3 | 54,886 | Anthropic | Proprietary |
| 21 | GPT-5.4 | 1,468 | +/-5 | 17,925 | OpenAI | Proprietary |
| 22 | | 1,467 | +/-3 | 55,257 | xAI | Proprietary |
| 23 | Claude Sonnet 4.6 | 1,466 | +/-5 | 17,127 | Anthropic | Proprietary |
| 24 | mimo-v2.5-pro | 1,464 | +/-7 | 6,238 | Xiaomi | MIT |
| 25 | qwen3.5-max-preview | 1,464 | +/-5 | 14,558 | Alibaba | Proprietary |
| 26 | Gemini 3.0 Flash (minimal) | 1,463 | +/-4 | 41,346 | Google Deep Mind | Proprietary |
| 27 | DeepSeek-V4-Pro | 1,463 | +/-9 | 4,160 | DeepSeek-AI | MIT |
| 28 | Kimi K2.6 | 1,462 | +/-7 | 7,108 | Moonshot AI | Modified MIT |
| 29 | deepseek-v4-pro-thinking | 1,462 | +/-9 | 3,808 | DeepSeek | MIT |
| 30 | | 1,459 | +/-3 | 59,206 | xAI | Proprietary |
| 31 | dola-seed-2.0-pro | 1,459 | +/-5 | 26,587 | Bytedance | Proprietary |
| 32 | Qwen3.6-Max-Preview | 1,457 | +/-9 | 3,965 | Alibaba | Proprietary |
| 33 | GLM-5 | 1,457 | +/-5 | 20,292 | Zhipu AI | MIT |
| 34 | gpt-5.4-mini-high | 1,456 | +/-5 | 14,952 | OpenAI | Proprietary |
| 35 | | 1,455 | +/-8 | 5,234 | xAI | Proprietary |
| 36 | GPT-5.1 Pro (high) | 1,455 | +/-4 | 40,891 | OpenAI | Proprietary |
| 37 | Claude Sonnet 4.5 (thinking-32k) | 1,454 | +/-3 | 67,180 | Anthropic | Proprietary |
| 38 | Claude Sonnet 4.5 | 1,453 | +/-3 | 65,214 | Anthropic | Proprietary |
| 39 | Gemma 4 31B | 1,451 | +/-8 | 5,827 | DeepMind | Apache 2.0 |
| 40 | ERNIE 5.0 | 1,450 | +/-4 | 28,724 | Baidu | Proprietary |
| 41 | Kimi K2 Thinking | 1,449 | +/-4 | 27,282 | Moonshot AI | Modified MIT |
| 42 | ERNIE 5.0 | 1,449 | +/-7 | 9,764 | Baidu | Proprietary |
| 43 | Opus 4.1 (thinking-16k) | 1,449 | +/-3 | 49,850 | Anthropic | Proprietary |
| 44 | gpt-5.3-chat-latest | 1,449 | +/-5 | 22,474 | OpenAI | Proprietary |
| 45 | Gemini 2.5 Pro Experimental 03-25 | 1,448 | +/-3 | 114,865 | Google Deep Mind | Proprietary |
| 46 | Qwen 3.6 Plus Preview | 1,448 | +/-6 | 8,683 | Alibaba | Proprietary |
| 47 | Opus 4.1 | 1,447 | +/-3 | 77,425 | Anthropic | Proprietary |
| 48 | mimo-v2-pro | 1,447 | +/-5 | 15,257 | Xiaomi | Proprietary |
| 49 | Qwen3.5-397B-A17B | 1,446 | +/-5 | 22,471 | Alibaba | Apache 2.0 |
| 50 | GPT-4.5 | 1,444 | +/-6 | 14,547 | OpenAI | Proprietary |
| 51 | chatgpt-4o-latest-20250326 | 1,443 | +/-3 | 82,527 | OpenAI | Proprietary |
| 52 | GLM-4.7 | 1,443 | +/-6 | 12,142 | Zhipu AI | MIT |
| 53 | deepseek-v4-flash-thinking | 1,440 | +/-9 | 3,600 | DeepSeek | MIT |
| 54 | GPT-5.2 Pro (high) | 1,440 | +/-4 | 38,067 | OpenAI | Proprietary |
| 55 | GPT-5.1 Instant | 1,439 | +/-4 | 43,533 | OpenAI | Proprietary |
| 56 | gemini-3.1-flash-lite-preview | 1,438 | +/-5 | 23,715 | Google | Proprietary |
| 57 | Gemma 4 26B A4B | 1,438 | +/-8 | 5,782 | DeepMind | Apache 2.0 |
| 58 | GPT-5.2 | 1,437 | +/-4 | 35,182 | OpenAI | Proprietary |
| 59 | Qwen3 Max (Preview) | 1,435 | +/-5 | 27,743 | Alibaba | Proprietary |
| 60 | longcat-flash-chat-2602-exp | 1,434 | +/-6 | 13,311 | Meituan | Proprietary |
| 61 | GPT-5-Pro (high) | 1,434 | +/-5 | 31,963 | OpenAI | Proprietary |
| 62 | DeepSeek-V4-Flash | 1,433 | +/-9 | 3,506 | DeepSeek-AI | MIT |
| 63 | kimi-k2.5-instant | 1,432 | +/-7 | 8,207 | Moonshot | Modified MIT |
| 64 | | 1,432 | +/-3 | 50,028 | xAI | Proprietary |
| 65 | OpenAI o3 | 1,431 | +/-4 | 59,783 | OpenAI | Proprietary |
| 66 | Kimi K2 Thinking (thinking-turbo) | 1,430 | +/-3 | 52,935 | Moonshot AI | Modified MIT |
| 67 | amazon-nova-experimental-chat-26-02-10 | 1,428 | +/-10 | 3,424 | Amazon | Proprietary |
| 68 | GPT-5 | 1,426 | +/-4 | 31,617 | OpenAI | Proprietary |
| 69 | GLM-4.6 | 1,426 | +/-4 | 35,694 | Zhipu AI | MIT |
| 70 | DeepSeek V3.2-Exp (thinking) | 1,425 | +/-7 | 9,076 | DeepSeek-AI | MIT |
| 71 | DeepSeek V3.2 | 1,424 | +/-4 | 44,820 | DeepSeek-AI | MIT |
| 72 | qwen3-max-2025-09-23 | 1,424 | +/-6 | 9,179 | Alibaba | Proprietary |
| 73 | Claude Opus 4 (thinking-16k) | 1,424 | +/-4 | 36,937 | Anthropic | Proprietary |
| 74 | DeepSeek V3.2-Exp | 1,423 | +/-6 | 11,943 | DeepSeek-AI | MIT |
| 75 | mimo-v2.5 | 1,423 | +/-7 | 6,300 | Xiaomi | MIT |
| 76 | Qwen3-235B-A22B-2507 | 1,423 | +/-3 | 88,518 | Alibaba | Apache 2.0 |
| 77 | DeepSeek V3.2 (thinking) | 1,422 | +/-4 | 39,071 | DeepSeek-AI | MIT |
| 78 | DeepSeek-R1-0528 | 1,422 | +/-6 | 18,469 | DeepSeek-AI | MIT |
| 79 | | 1,421 | +/-8 | 6,823 | xAI | Proprietary |
| 80 | ERNIE 5.0 | 1,419 | +/-9 | 4,715 | Baidu | Proprietary |
| 81 | Qwen3.5-122B-A10B | 1,418 | +/-5 | 19,379 | Alibaba | Apache 2.0 |
| 82 | hunyuan-hy3-preview | 1,418 | +/-8 | 4,582 | Tencent | tencent-hunyuan-community |
| 83 | Kimi K2 0905 | 1,418 | +/-6 | 11,798 | Moonshot AI | Modified MIT |
| 84 | DeepSeek-V3.1 | 1,418 | +/-6 | 14,985 | DeepSeek-AI | MIT |
| 85 | Kimi K2 | 1,417 | +/-5 | 27,644 | Moonshot AI | Modified MIT |
| 86 | deepseek-v3.1-terminus-thinking | 1,417 | +/-10 | 3,474 | DeepSeek | MIT |
| 87 | DeepSeek-V3.1 (thinking) | 1,417 | +/-7 | 11,754 | DeepSeek-AI | MIT |
| 88 | DeepSeek-V3.1 Terminus | 1,416 | +/-10 | 3,713 | DeepSeek-AI | MIT |
| 89 | Qwen3-VL-235B-A22B-Instruct | 1,415 | +/-6 | 11,529 | Alibaba | Apache 2.0 |
| 90 | amazon-nova-experimental-chat-26-01-10 | 1,415 | +/-10 | 3,418 | Amazon | Proprietary |
| 91 | Mistral Large 3 | 1,415 | +/-4 | 41,365 | MistralAI | Apache 2.0 |
| 92 | GPT-4.1 | 1,413 | +/-4 | 51,035 | OpenAI | Proprietary |
| 93 | Claude Opus 4 | 1,412 | +/-4 | 44,244 | Anthropic | Proprietary |
| 94 | | 1,412 | +/-4 | 32,916 | xAI | Proprietary |
| 95 | GLM-4.5 | 1,411 | +/-5 | 24,336 | Zhipu AI | MIT |
| 96 | Gemini 2.5 Flash | 1,411 | +/-3 | 114,591 | Google Deep Mind | Proprietary |
| 97 | | 1,410 | +/-4 | 41,416 | xAI | Proprietary |
| 98 | Magistral-Medium-2506 | 1,409 | +/-3 | 84,463 | MistralAI | Proprietary |
| 99 | Haiku 4.5 | 1,409 | +/-3 | 67,007 | Anthropic | Proprietary |
| 100 | | 1,407 | +/-6 | 13,525 | MiniMaxAI | Modified MIT |
| 101 | Qwen3.5-27B | 1,406 | +/-5 | 18,942 | Alibaba | Apache 2.0 |
| 102 | gpt-5.4-nano-high | 1,406 | +/-5 | 14,363 | OpenAI | Proprietary |
| 103 | Gemini 2.5 Flash-Preview-09-2025 | 1,405 | +/-4 | 32,938 | Google Deep Mind | Proprietary |
| 104 | | 1,404 | +/-5 | 18,737 | xAI | Proprietary |
| 105 | qwen3-235b-a22b-no-thinking | 1,403 | +/-5 | 38,241 | Alibaba | Apache 2.0 |
| 106 | Qwen3-Next | 1,402 | +/-5 | 22,883 | Alibaba | Apache 2.0 |
| 107 | o1-2024-12-17 | 1,402 | +/-4 | 27,807 | OpenAI | Proprietary |
| 108 | longcat-flash-chat | 1,401 | +/-6 | 11,409 | Meituan | MIT |
| 109 | qwen3-235b-a22b-thinking-2507 | 1,399 | +/-7 | 9,004 | Alibaba | Apache 2.0 |
| 110 | Claude Sonnet 4 (thinking-32k) | 1,399 | +/-4 | 35,132 | Anthropic | Proprietary |
| 111 | Step 3.5 Flash | 1,398 | +/-5 | 19,649 | StepFunAI | Proprietary |
| 112 | DeepSeek-R1 | 1,398 | +/-5 | 18,524 | DeepSeek-AI | MIT |
| 113 | Qwen3.5-35B-A3B | 1,397 | +/-5 | 19,774 | Alibaba | Apache 2.0 |
| 114 | hunyuan-vision-1.5-thinking | 1,396 | +/-12 | 2,221 | Tencent | Proprietary |
| 115 | Qwen3-VL-235B-A22B-Instruct (thinking) | 1,396 | +/-7 | 7,944 | Alibaba | Apache 2.0 |
| 116 | amazon-nova-experimental-chat-12-10 | 1,395 | +/-10 | 3,690 | Amazon | Proprietary |
| 117 | DeepSeek-V3-0324 | 1,395 | +/-4 | 45,533 | DeepSeek-AI | MIT |
| 118 | | 1,395 | +/-4 | 24,885 | MiniMaxAI | Modified MIT |
| 119 | Step 3.5 Flash | 1,393 | +/-4 | 25,112 | StepFunAI | Apache 2.0 |
| 120 | mimo-v2-flash (non-thinking) | 1,393 | +/-4 | 37,247 | Xiaomi | MIT |
| 121 | mai-1-preview | 1,393 | +/-5 | 17,899 | Microsoft AI | Proprietary |
| 122 | gpt-5-mini-high | 1,390 | +/-5 | 27,053 | OpenAI | Proprietary |
| 123 | OpenAI o4 - mini | 1,390 | +/-4 | 45,463 | OpenAI | Proprietary |
| 124 | Claude Sonnet 4 | 1,389 | +/-4 | 40,351 | Anthropic | Proprietary |
| 125 | OpenAI o1 | 1,388 | +/-5 | 31,122 | OpenAI | Proprietary |
| 126 | mimo-v2-flash (thinking) | 1,388 | +/-6 | 10,982 | Xiaomi | MIT |
| 127 | Hunyuan-T1 | 1,387 | +/-9 | 4,710 | Tencent AI Lab | Proprietary |
| 128 | Qwen3-Coder-480B-A35B | 1,387 | +/-5 | 25,757 | Alibaba | Apache 2.0 |
| 129 | Claude Sonnet 3.7 (thinking-32k) | 1,387 | +/-4 | 38,841 | Anthropic | Proprietary |
| 130 | mistral-medium-2505 | 1,386 | +/-5 | 33,244 | Mistral | Proprietary |
| 131 | | 1,385 | +/-5 | 17,165 | MiniMaxAI | MIT |
| 132 | Qwen3-30B-A3B-2507 | 1,383 | +/-5 | 23,766 | Alibaba | Apache 2.0 |
| 133 | GPT-4.1 mini | 1,382 | +/-4 | 39,353 | OpenAI | Proprietary |
| 134 | hunyuan-turbos-20250416 | 1,382 | +/-6 | 10,723 | Tencent | Proprietary |
| 135 | trinity-large-thinking | 1,380 | +/-6 | 12,239 | Arcee AI | Apache 2.0 |
| 136 | Gemini 2.5 Flash-Lite-Preview-09-2025 (no-thinking) | 1,380 | +/-3 | 47,285 | Google Deep Mind | Proprietary |
| 137 | GLM-4.6V | 1,378 | +/-11 | 2,810 | Zhipu AI | MIT |
| 138 | trinity-large-preview | 1,375 | +/-5 | 20,978 | Arcee AI | Apache 2.0 |
| 139 | Qwen3-235B-A22B | 1,375 | +/-5 | 26,284 | Alibaba | Apache 2.0 |
| 140 | Gemini 2.5 Flash-Lite (thinking) | 1,374 | +/-5 | 32,947 | Google Deep Mind | Proprietary |
| 141 | Qwen2.5-Max | 1,374 | +/-4 | 32,625 | Alibaba | Proprietary |
| 142 | GLM-4.5-Air | 1,373 | +/-4 | 31,119 | Zhipu AI | MIT |
| 143 | Claude 3.5 Sonnet | 1,372 | +/-3 | 88,359 | Anthropic | Proprietary |
| 144 | Claude Sonnet 3.7 | 1,371 | +/-4 | 43,206 | Anthropic | Proprietary |
| 145 | Qwen3-Next (thinking) | 1,369 | +/-6 | 13,707 | Alibaba | Apache 2.0 |
| 146 | GLM-4.7-Flash | 1,368 | +/-6 | 11,763 | Zhipu AI | MIT |
| 147 | amazon-nova-experimental-chat-11-10 | 1,367 | +/-4 | 25,445 | Amazon | Proprietary |
| 148 | Gemma 3 - 27B (IT) | 1,366 | +/-4 | 47,569 | Google Deep Mind | Gemma |
| 149 | minimax-m1 | 1,363 | +/-4 | 35,233 | MiniMax | Apache 2.0 |
| 150 | o3-mini-high | 1,363 | +/-5 | 18,589 | OpenAI | Proprietary |
| 151 | OpenAI o3-mini (high) | 1,362 | +/-5 | 16,977 | OpenAI | Proprietary |
| 152 | nvidia-nemotron-3-super-120b-a12b | 1,361 | +/-7 | 7,419 | Nvidia | NVIDIA Open Model |
| 153 | Gemini 2.0 Flash Experimental | 1,360 | +/-4 | 43,767 | DeepMind | Proprietary |
| 154 | DeepSeek-V3 | 1,358 | +/-5 | 21,770 | DeepSeek-AI | DeepSeek |
| 155 | Mistral-Small-3.2 | 1,357 | +/-5 | 17,716 | MistralAI | Apache 2.0 |
| 156 | | 1,357 | +/-5 | 22,724 | xAI | Proprietary |
| 157 | intellect-3 | 1,357 | +/-8 | 5,337 | Prime Intellect | MIT |
| 158 | C4AI Command A (202503) | 1,353 | +/-3 | 56,304 | CohereAI | CC-BY-NC-4.0 |
| 159 | GLM-4.5V | 1,353 | +/-8 | 4,965 | Zhipu AI | MIT |
| 160 | Gemini 2.0 Flash-Lite | 1,353 | +/-4 | 24,955 | DeepMind | Proprietary |
| 161 | GPT OSS 120B | 1,353 | +/-4 | 30,653 | OpenAI | Apache 2.0 |
| 162 | Gemini 1.5 Pro | 1,351 | +/-3 | 55,606 | Google Deep Mind | Proprietary |
| 163 | amazon-nova-experimental-chat-10-20 | 1,350 | +/-6 | 11,479 | Amazon | Proprietary |
| 164 | hunyuan-turbos-20250226 | 1,348 | +/-12 | 2,220 | Tencent | Proprietary |
| 165 | Step3 | 1,348 | +/-7 | 6,551 | StepFunAI | Apache 2.0 |
| 166 | amazon-nova-experimental-chat-10-09 | 1,348 | +/-11 | 2,839 | Amazon | Proprietary |
| 167 | o3-mini | 1,347 | +/-4 | 57,364 | OpenAI | Proprietary |
| 168 | Qwen3-32B | 1,347 | +/-9 | 3,926 | Alibaba | Apache 2.0 |
| 169 | llama-3.1-nemotron-ultra-253b-v1 | 1,347 | +/-12 | 2,549 | Nvidia | Nvidia Open Model |
| 170 | mercury-2 | 1,347 | +/-11 | 3,135 | Inception AI | Proprietary |
| 171 | ling-flash-2.0 | 1,346 | +/-7 | 7,015 | InclusionAI | MIT |
| 172 | | 1,346 | +/-8 | 6,871 | MiniMaxAI | Apache 2.0 |
| 173 | qwen-plus-0125 | 1,346 | +/-8 | 5,819 | Alibaba | Proprietary |
| 174 | GPT-4o | 1,345 | +/-3 | 112,881 | OpenAI | Proprietary |
| 175 | nvidia-llama-3.3-nemotron-super-49b-v1.5 | 1,343 | +/-10 | 3,346 | Nvidia | Nvidia Open |
| 176 | glm-4-plus-0111 | 1,343 | +/-8 | 5,760 | Zhipu | Proprietary |
| 177 | Claude 3.5 Sonnet | 1,342 | +/-3 | 82,419 | Anthropic | Proprietary |
| 178 | Gemma 3 - 12B (IT) | 1,342 | +/-10 | 3,829 | Google Deep Mind | Gemma |
| 179 | hunyuan-turbo-0110 | 1,340 | +/-12 | 2,290 | Tencent | Proprietary |
| 180 | gpt-5-nano-high | 1,337 | +/-7 | 8,274 | OpenAI | Proprietary |
| 181 | Nova 2 Lite | 1,337 | +/-6 | 12,250 | Amazon | Proprietary |
| 182 | OpenAI o1-mini | 1,337 | +/-4 | 51,981 | OpenAI | Proprietary |
| 183 | QwQ-32B | 1,336 | +/-4 | 25,401 | Alibaba | Apache 2.0 |
| 184 | | 1,335 | +/-4 | 63,498 | xAI | Proprietary |
| 185 | gemini-advanced-0514 | 1,335 | +/-5 | 50,148 | Google | Proprietary |
| 186 | GPT-4o | 1,334 | +/-4 | 45,499 | OpenAI | Proprietary |
| 187 | llama-3.1-405b-instruct-bf16 | 1,334 | +/-4 | 41,375 | Meta | Llama 3.1 Community |
| 188 | step-2-16k-exp-202412 | 1,334 | +/-9 | 4,833 | StepFun | Proprietary |
| 189 | llama-3.1-405b-instruct-fp8 | 1,333 | +/-4 | 59,656 | Meta | Llama 3.1 Community |
| 190 | olmo-3.1-32b-instruct | 1,330 | +/-6 | 12,240 | Ai2 | Apache 2.0 |
| 191 | molmo-2-8b | 1,329 | +/-21 | 802 | Ai2 | Apache 2.0 |
| 192 | yi-lightning | 1,328 | +/-5 | 27,332 | 01 AI | Proprietary |
| 193 | llama-3.3-nemotron-49b-super-v1 | 1,327 | +/-12 | 2,218 | Nvidia | Nvidia |
| 194 | Qwen3-30B-A3B | 1,327 | +/-5 | 26,502 | Alibaba | Apache 2.0 |
| 195 | llama-4-maverick-17b-128e-instruct | 1,327 | +/-4 | 39,996 | Meta | Llama 4 |
| 196 | hunyuan-large-2025-02-10 | 1,326 | +/-10 | 3,738 | Tencent | Proprietary |
| 197 | gpt-4-turbo-2024-04-09 | 1,324 | +/-4 | 98,114 | OpenAI | Proprietary |
| 198 | deepseek-v2.5-1210 | 1,323 | +/-8 | 6,795 | DeepSeek | DeepSeek |
| 199 | Claude 3.5 Haiku | 1,323 | +/-3 | 70,017 | Anthropic | Proprietary |
| 200 | Gemini 1.5 Pro | 1,323 | +/-4 | 79,138 | Google Deep Mind | Proprietary |
| 201 | llama-4-scout-17b-16e-instruct | 1,322 | +/-5 | 30,310 | Meta | Llama |
| 202 | gpt-4.1-nano-2025-04-14 | 1,322 | +/-8 | 6,103 | OpenAI | Proprietary |
| 203 | Claude3-Opus | 1,321 | +/-3 | 194,909 | Anthropic | Proprietary |
| 204 | ring-flash-2.0 | 1,321 | +/-7 | 7,156 | InclusionAI | MIT |
| 205 | step-1o-turbo-202506 | 1,320 | +/-7 | 9,039 | StepFun | Proprietary |
| 206 | glm-4-plus | 1,319 | +/-5 | 26,126 | Zhipu AI | Proprietary |
| 207 | gemma-3n-e4b-it | 1,318 | +/-5 | 22,610 | Google | Gemma |
| 208 | llama-3.3-70b-instruct | 1,318 | +/-3 | 54,748 | Meta | Llama-3.3 |
| 209 | qwen-max-0919 | 1,318 | +/-6 | 16,478 | Alibaba | Qwen |
| 210 | gpt-4o-mini-2024-07-18 | 1,317 | +/-4 | 68,710 | OpenAI | Proprietary |
| 211 | gpt-oss-20b | 1,317 | +/-6 | 10,637 | OpenAI | Apache 2.0 |
| 212 | nvidia-nemotron-3-nano-30b-a3b-bf16 | 1,317 | +/-6 | 15,530 | Nvidia | NVIDIA Open Model |
| 213 | qwen2.5-plus-1127 | 1,315 | +/-6 | 10,187 | Alibaba | Proprietary |
| 214 | athene-v2-chat | 1,314 | +/-5 | 24,739 | NexusFlow | NexusFlow |
| 215 | mistral-large-2407 | 1,313 | +/-4 | 45,459 | Mistral | Mistral Research |
| 216 | gpt-4-0125-preview | 1,312 | +/-4 | 93,439 | OpenAI | Proprietary |
| 217 | gpt-4-1106-preview | 1,312 | +/-4 | 100,105 | OpenAI | Proprietary |
| 218 | hunyuan-standard-2025-02-10 | 1,311 | +/-10 | 3,904 | Tencent | Proprietary |
| 219 | gemini-1.5-flash-002 | 1,309 | +/-4 | 34,902 | Google | Proprietary |
| 220 | | 1,308 | +/-4 | 52,567 | xAI | Proprietary |
| 221 | deepseek-v2.5 | 1,307 | +/-5 | 24,572 | DeepSeek | DeepSeek |
| 222 | mercury | 1,306 | +/-14 | 1,958 | Inception AI | Proprietary |
| 223 | athene-70b-0725 | 1,306 | +/-6 | 19,621 | NexusFlow | CC-BY-NC-4.0 |
| 224 | olmo-3-32b-think | 1,305 | +/-8 | 5,962 | Ai2 | Apache 2.0 |
| 225 | mistral-large-2411 | 1,305 | +/-4 | 28,073 | Mistral | MRL |
| 226 | magistral-medium-2506 | 1,303 | +/-6 | 11,646 | Mistral | Proprietary |
| 227 | gemma-3-4b-it | 1,303 | +/-9 | 4,171 | Google | Gemma |
| 228 | mistral-small-3.1-24b-instruct-2503 | 1,303 | +/-5 | 33,231 | Mistral | Apache 2.0 |
| 229 | qwen2.5-72b-instruct | 1,302 | +/-4 | 39,406 | Alibaba | Qwen |
| 230 | llama-3.1-nemotron-70b-instruct | 1,299 | +/-8 | 7,140 | Nvidia | Llama 3.1 |
| 231 | hunyuan-large-vision | 1,294 | +/-9 | 5,370 | Tencent | Proprietary |
| 232 | llama-3.1-70b-instruct | 1,293 | +/-4 | 55,240 | Meta | Llama 3.1 Community |
| 233 | amazon-nova-pro-v1.0 | 1,290 | +/-5 | 24,745 | Amazon | Proprietary |
| 234 | jamba-1.5-large | 1,288 | +/-7 | 8,662 | AI21 Labs | Jamba Open |
| 235 | gemma-2-27b-it | 1,288 | +/-3 | 75,754 | Google | Gemma license |
| 236 | reka-core-20240904 | 1,287 | +/-7 | 7,312 | Reka AI | Proprietary |
| 237 | ibm-granite-h-small | 1,287 | +/-8 | 5,679 | IBM | Apache 2.0 |
| 238 | gpt-4-0314 | 1,286 | +/-5 | 54,173 | OpenAI | Proprietary |
| 239 | llama-3.1-tulu-3-70b | 1,286 | +/-10 | 2,846 | Ai2 | Llama 3.1 |
| 240 | llama-3.1-nemotron-51b-instruct | 1,285 | +/-10 | 3,749 | Nvidia | Llama 3.1 |
| 241 | gemini-1.5-flash-001 | 1,285 | +/-4 | 62,833 | Google | Proprietary |
| 242 | olmo-3.1-32b-think | 1,285 | +/-7 | 8,512 | Ai2 | Apache 2.0 |
| 243 | claude-3-sonnet-20240229 | 1,280 | +/-4 | 109,284 | Anthropic | Proprietary |
| 244 | gemma-2-9b-it-simpo | 1,279 | +/-7 | 10,072 | Princeton | MIT |
| 245 | nemotron-4-340b-instruct | 1,276 | +/-5 | 19,659 | Nvidia | NVIDIA Open Model |
| 246 | command-r-plus-08-2024 | 1,276 | +/-7 | 9,866 | Cohere | CC-BY-NC-4.0 |
| 247 | llama-3-70b-instruct | 1,275 | +/-4 | 156,876 | Meta | Llama 3 Community |
| 248 | gpt-4-0613 | 1,274 | +/-4 | 88,723 | OpenAI | Proprietary |
| 249 | mistral-small-24b-instruct-2501 | 1,274 | +/-6 | 14,681 | Mistral | Apache 2.0 |
| 250 | glm-4-0520 | 1,273 | +/-7 | 9,788 | Zhipu AI | Proprietary |
| 251 | reka-flash-20240904 | 1,271 | +/-7 | 7,536 | Reka AI | Proprietary |
| 252 | qwen2.5-coder-32b-instruct | 1,270 | +/-8 | 5,432 | Alibaba | Apache 2.0 |
| 253 | c4ai-aya-expanse-32b | 1,266 | +/-5 | 27,124 | Cohere | CC-BY-NC-4.0 |
| 254 | gemma-2-9b-it | 1,265 | +/-4 | 54,611 | Google | Gemma license |
| 255 | deepseek-coder-v2 | 1,264 | +/-6 | 15,147 | DeepSeek | DeepSeek License |
| 256 | command-r-plus | 1,261 | +/-4 | 77,554 | Cohere | CC-BY-NC-4.0 |
| 257 | qwen2-72b-instruct | 1,261 | +/-5 | 37,325 | Alibaba | Qianwen LICENSE |
| 258 | claude-3-haiku-20240307 | 1,260 | +/-4 | 117,701 | Anthropic | Proprietary |
| 259 | amazon-nova-lite-v1.0 | 1,260 | +/-5 | 19,372 | Amazon | Proprietary |
| 260 | gemini-1.5-flash-8b-001 | 1,258 | +/-4 | 35,558 | Google | Proprietary |
| 261 | Phi 4 - 14B | 1,256 | +/-5 | 24,126 | Microsoft Azure | MIT |
| 262 | olmo-2-0325-32b-instruct | 1,251 | +/-11 | 3,334 | Ai2 | Apache-2.0 |
| 263 | command-r-08-2024 | 1,249 | +/-7 | 10,140 | Cohere | CC-BY-NC-4.0 |
| 264 | mistral-large-2402 | 1,241 | +/-5 | 62,436 | Mistral | Proprietary |
| 265 | amazon-nova-micro-v1.0 | 1,240 | +/-5 | 19,364 | Amazon | Proprietary |
| 266 | jamba-1.5-mini | 1,239 | +/-7 | 8,858 | AI21 Labs | Jamba Open |
| 267 | ministral-8b-2410 | 1,237 | +/-9 | 4,781 | Mistral | MRL |
| 268 | gemini-pro-dev-api | 1,235 | +/-7 | 18,354 | Google | Proprietary |
| 269 | qwen1.5-110b-chat | 1,233 | +/-6 | 26,195 | Alibaba | Qianwen LICENSE |
| 270 | hunyuan-standard-256k | 1,233 | +/-12 | 2,728 | Tencent | Proprietary |
| 271 | reka-flash-21b-20240226-online | 1,232 | +/-7 | 15,450 | Reka AI | Proprietary |
| 272 | qwen1.5-72b-chat | 1,232 | +/-5 | 39,302 | Alibaba | Qianwen LICENSE |
| 273 | mixtral-8x22b-instruct-v0.1 | 1,228 | +/-5 | 51,416 | Mistral | Apache 2.0 |
| 274 | command-r | 1,226 | +/-5 | 54,036 | Cohere | CC-BY-NC-4.0 |
| 275 | reka-flash-21b-20240226 | 1,226 | +/-6 | 24,806 | Reka AI | Proprietary |
| 276 | gpt-3.5-turbo-0125 | 1,223 | +/-5 | 66,207 | OpenAI | Proprietary |
| 277 | llama-3-8b-instruct | 1,222 | +/-4 | 104,642 | Meta | Llama 3 Community |
| 278 | c4ai-aya-expanse-8b | 1,222 | +/-7 | 9,818 | Cohere | CC-BY-NC-4.0 |
| 279 | mistral-medium | 1,222 | +/-6 | 34,550 | Mistral | Proprietary |
| 280 | gemini-pro | 1,221 | +/-12 | 6,390 | Google | Proprietary |
| 281 | llama-3.1-tulu-3-8b | 1,220 | +/-11 | 2,896 | Ai2 | Llama 3.1 |
| 282 | yi-1.5-34b-chat | 1,212 | +/-5 | 24,146 | 01 AI | Apache-2.0 |
| 283 | zephyr-orpo-141b-A35b-v0.1 | 1,212 | +/-11 | 4,652 | HuggingFace | Apache 2.0 |
| 284 | llama-3.1-8b-instruct | 1,211 | +/-4 | 49,605 | Meta | Llama 3.1 Community |
| 285 | granite-3.1-8b-instruct | 1,207 | +/-11 | 3,090 | IBM | Apache 2.0 |
| 286 | qwen1.5-32b-chat | 1,203 | +/-6 | 21,741 | Alibaba | Qianwen LICENSE |
| 287 | gpt-3.5-turbo-1106 | 1,202 | +/-9 | 16,619 | OpenAI | Proprietary |
| 288 | gemma-2-2b-it | 1,199 | +/-4 | 46,616 | Google | Gemma license |
| 289 | phi-3-medium-4k-instruct | 1,197 | +/-5 | 25,055 | Microsoft | MIT |
| 290 | mixtral-8x7b-instruct-v0.1 | 1,196 | +/-4 | 73,503 | Mistral | Apache 2.0 |
| 291 | dbrx-instruct-preview | 1,194 | +/-6 | 32,191 | Databricks | DBRX LICENSE |
| 292 | internlm2_5-20b-chat | 1,191 | +/-7 | 9,901 | InternLM | Other |
| 293 | qwen1.5-14b-chat | 1,190 | +/-7 | 17,839 | Alibaba | Qianwen LICENSE |
| 294 | wizardlm-70b | 1,184 | +/-9 | 8,214 | Microsoft | Llama 2 Community |
| 295 | deepseek-llm-67b-chat | 1,183 | +/-12 | 4,932 | DeepSeek | DeepSeek License |
| 296 | yi-34b-chat | 1,183 | +/-7 | 15,483 | 01 AI | Yi License |
| 297 | openchat-3.5-0106 | 1,181 | +/-8 | 12,637 | OpenChat | Apache-2.0 |
| 298 | openchat-3.5 | 1,181 | +/-10 | 7,968 | OpenChat | Apache-2.0 |
| 299 | granite-3.0-8b-instruct | 1,181 | +/-9 | 6,638 | IBM | Apache 2.0 |
| 300 | gemma-1.1-7b-it | 1,180 | +/-6 | 23,893 | Google | Gemma license |
| 301 | snowflake-arctic-instruct | 1,178 | +/-6 | 32,832 | Snowflake | Apache 2.0 |
| 302 | granite-3.1-2b-instruct | 1,178 | +/-11 | 3,188 | IBM | Apache 2.0 |
| 303 | tulu-2-dpo-70b | 1,177 | +/-10 | 6,535 | AllenAI/UW | AI2 ImpACT Low-risk |
| 304 | openhermes-2.5-mistral-7b | 1,174 | +/-10 | 5,006 | NousResearch | Apache-2.0 |
| 305 | vicuna-33b | 1,172 | +/-6 | 22,479 | LMSYS | Non-commercial |
| 306 | starling-lm-7b-beta | 1,171 | +/-7 | 16,056 | Nexusflow | Apache-2.0 |
| 307 | phi-3-small-8k-instruct | 1,170 | +/-6 | 17,766 | Microsoft | MIT |
| 308 | llama-2-70b-chat | 1,170 | +/-6 | 38,492 | Meta | Llama 2 Community |
| 309 | starling-lm-7b-alpha | 1,166 | +/-8 | 10,224 | UC Berkeley | CC-BY-NC-4.0 |
| 310 | llama-3.2-3b-instruct | 1,166 | +/-8 | 7,936 | Meta | Llama 3.2 |
| 311 | nous-hermes-2-mixtral-8x7b-dpo | 1,164 | +/-12 | 3,777 | NousResearch | Apache-2.0 |
| 312 | qwq-32b-preview | 1,156 | +/-12 | 3,231 | Alibaba | Apache 2.0 |
| 313 | granite-3.0-2b-instruct | 1,155 | +/-8 | 6,837 | IBM | Apache 2.0 |
| 314 | llama2-70b-steerlm-chat | 1,154 | +/-13 | 3,585 | Nvidia | Llama 2 Community |
| 315 | solar-10.7b-instruct-v1.0 | 1,151 | +/-13 | 4,155 | Upstage AI | CC-BY-NC-4.0 |
| 316 | dolphin-2.2.1-mistral-7b | 1,151 | +/-15 | 1,679 | Cognitive Computations | Apache-2.0 |
| 317 | mpt-30b-chat | 1,149 | +/-12 | 2,572 | MosaicML | CC-BY-NC-SA-4.0 |
| 318 | mistral-7b-instruct-v0.2 | 1,148 | +/-7 | 19,402 | Mistral | Apache-2.0 |
| 319 | wizardlm-13b | 1,148 | +/-9 | 7,044 | Microsoft | Llama 2 Community |
| 320 | falcon-180b-chat | 1,146 | +/-17 | 1,295 | TII | Falcon-180B TII License |
| 321 | qwen1.5-7b-chat | 1,143 | +/-10 | 4,737 | Alibaba | Qianwen LICENSE |
| 322 | phi-3-mini-4k-instruct-june-2024 | 1,142 | +/-6 | 12,297 | Microsoft | MIT |
| 323 | llama-2-13b-chat | 1,141 | +/-7 | 19,174 | Meta | Llama 2 Community |
| 324 | vicuna-13b | 1,140 | +/-7 | 19,367 | LMSYS | Llama 2 Community |
| 325 | qwen-14b-chat | 1,137 | +/-11 | 4,964 | Alibaba | Qianwen LICENSE |
| 326 | palm-2 | 1,136 | +/-9 | 8,554 | Google | Proprietary |
| 327 | gemma-7b-it | 1,136 | +/-10 | 8,925 | Google | Gemma license |
| 328 | codellama-34b-instruct | 1,136 | +/-9 | 7,366 | Meta | Llama 2 Community |
| 329 | zephyr-7b-beta | 1,130 | +/-9 | 11,118 | HuggingFace | MIT |
| 330 | phi-3-mini-128k-instruct | 1,128 | +/-7 | 20,685 | Microsoft | MIT |
| 331 | phi-3-mini-4k-instruct | 1,127 | +/-6 | 20,118 | Microsoft | MIT |
| 332 | guanaco-33b | 1,126 | +/-12 | 2,921 | UW | Non-commercial |
| 333 | zephyr-7b-alpha | 1,126 | +/-16 | 1,785 | HuggingFace | MIT |
| 334 | stripedhyena-nous-7b | 1,120 | +/-11 | 5,182 | Together AI | Apache 2.0 |
| 335 | codellama-70b-instruct | 1,118 | +/-18 | 1,143 | Meta | Llama 2 Community |
| 336 | gemma-1.1-2b-it | 1,114 | +/-8 | 10,854 | Google | Gemma license |
| 337 | vicuna-7b | 1,114 | +/-9 | 6,923 | LMSYS | Llama 2 Community |
| 338 | smollm2-1.7b-instruct | 1,113 | +/-14 | 2,199 | HuggingFace | Apache 2.0 |
| 339 | llama-3.2-1b-instruct | 1,110 | +/-8 | 8,045 | Meta | Llama 3.2 |
| 340 | mistral-7b-instruct | 1,109 | +/-9 | 8,977 | Mistral | Apache 2.0 |
| 341 | llama-2-7b-chat | 1,107 | +/-7 | 14,148 | Meta | Llama 2 Community |
| 342 | gemma-2b-it | 1,091 | +/-12 | 4,780 | Google | Gemma license |
| 343 | qwen1.5-4b-chat | 1,089 | +/-9 | 7,597 | Alibaba | Qianwen LICENSE |
| 344 | olmo-7b-instruct | 1,073 | +/-11 | 6,328 | Ai2 | Apache-2.0 |
| 345 | koala-13b | 1,069 | +/-10 | 6,965 | UC Berkeley | Non-commercial |
| 346 | alpaca-13b | 1,067 | +/-11 | 5,745 | Stanford | Non-commercial |
| 347 | gpt4all-13b-snoozy | 1,065 | +/-15 | 1,743 | Nomic AI | Non-commercial |
| 348 | mpt-7b-chat | 1,061 | +/-12 | 3,924 | MosaicML | CC-BY-NC-SA-4.0 |
| 349 | chatglm3-6b | 1,055 | +/-12 | 4,658 | Tsinghua | Apache-2.0 |
| 350 | RWKV-4-Raven-14B | 1,040 | +/-11 | 4,845 | RWKV | Apache 2.0 |
| 351 | chatglm2-6b | 1,023 | +/-14 | 2,658 | Tsinghua | Apache-2.0 |
| 352 | oasst-pythia-12b | 1,021 | +/-11 | 6,310 | OpenAssistant | Apache 2.0 |
| 353 | chatglm-6b | 994 | +/-13 | 4,914 | Tsinghua | Non-commercial |
| 354 | fastchat-t5-3b | 990 | +/-12 | 4,203 | LMSYS | Apache 2.0 |
| 355 | dolly-v2-12b | 979 | +/-14 | 3,412 | Databricks | MIT |
| 356 | llama-13b | 972 | +/-16 | 2,391 | Meta | Non-commercial |
| 357 | stablelm-tuned-alpha-7b | 952 | +/-13 | 3,287 | Stability AI | CC-BY-NC-SA-4.0 |
Data is for reference only. Official sources are authoritative.
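When reading the table, a small score gap between neighbouring models may fall inside their confidence intervals. The sketch below parses the Score and 95% CI cells as they appear in the table and checks whether two intervals overlap. The helper names are hypothetical, and interval overlap is only a rough heuristic for "not clearly separated", not a formal significance test.

```python
def parse_score(cell):
    """Convert a Score cell such as '1,503' to an int."""
    return int(cell.replace(",", ""))

def parse_half_width(cell):
    """Convert a 95% CI cell such as '+/-6' to an int half-width."""
    return int(cell.replace("+/-", ""))

def interval(score_cell, ci_cell):
    """Return (low, high) bounds for one table row."""
    score = parse_score(score_cell)
    half = parse_half_width(ci_cell)
    return (score - half, score + half)

def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# Rows copied from the ranking table: (model, Score, 95% CI).
top = interval("1,503", "+/-6")        # Opus 4.7 (thinking)
runner_up = interval("1,502", "+/-5")  # Claude Opus 4.6 (thinking)
flash = interval("1,474", "+/-4")      # Gemini 3.0 Flash

print(intervals_overlap(top, runner_up))  # ranks 1 and 2
print(intervals_overlap(top, flash))      # ranks 1 and 16
```

Under this heuristic the first two rows are not clearly separated, while the gap between rank 1 and rank 16 is larger than the two intervals combined.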
FAQ
What is Text Generation Arena (LMArena)?
Text Generation Arena, formerly LMSYS Chatbot Arena, is one of the most widely followed anonymous LLM evaluation platforms. Users compare answers from two hidden models and vote for the better response; Elo-style scoring aggregates those votes into a dynamic leaderboard.
How is the Arena Elo score calculated?
Arena Elo is adapted from chess rating systems. In the classic Elo scheme, after each head-to-head comparison the preferred model gains rating points and the other model loses the same amount; the change is larger when the result is an upset relative to the rating gap. The 95% confidence interval expresses the uncertainty of the estimate: models with more votes generally have narrower intervals.
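The per-comparison update described above can be sketched with the textbook Elo formulas. This is a hedged illustration, not the leaderboard's exact computation: the 400-point scale and the K-factor of 4 are assumed values, and the function names are hypothetical.

```python
def expected_score(rating_a, rating_b, scale=400.0):
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))

def elo_update(winner, loser, k=4.0, scale=400.0):
    """Return new (winner, loser) ratings after one comparison.

    k controls the step size (an assumed value here). The winner gains
    exactly what the loser gives up, so the total rating is conserved.
    """
    delta = k * (1.0 - expected_score(winner, loser, scale))
    return winner + delta, loser - delta

# A 12-point gap (1,503 vs 1,491 in the table) implies only a slight edge.
p = expected_score(1503, 1491)
print(f"expected preference rate: {p:.3f}")

# An upset (lower-rated model preferred) moves ratings more than an expected win.
upset = elo_update(1400, 1500)
expected_win = elo_update(1500, 1400)
print(upset[0] - 1400, expected_win[0] - 1500)
```

Under these assumptions, a 12-point gap corresponds to the higher-rated model being preferred a little more than half the time, which is why small differences near the top of the table should be read together with the confidence intervals.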
Why do some models have both Thinking and regular versions?
Some models offer an extended-thinking mode that spends more inference time reasoning before producing the final answer. This can improve scores on reasoning, math, and coding tasks, but usually increases latency and cost, so Arena tracks these variants separately.
How should I choose an LLM from this leaderboard?
Consider overall Elo, cost, language coverage, open-source availability, and latency. The top-ranked model is not always the best fit for every workflow.
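As one way to apply these criteria, the sketch below shortlists models by license and minimum score using a few rows copied from the ranking table. The `Entry` type, the `shortlist` helper, the license allow-list, and the 1,440 threshold are all hypothetical choices for illustration; cost and latency are not in the table, so they are not modelled here.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    model: str
    score: int
    license: str

def shortlist(entries, allowed_licenses, min_score):
    """Keep entries with an allowed license and score >= min_score, best first."""
    picked = [
        e for e in entries
        if e.license in allowed_licenses and e.score >= min_score
    ]
    return sorted(picked, key=lambda e: e.score, reverse=True)

# A handful of rows from the table above.
entries = [
    Entry("Opus 4.7 (thinking)", 1503, "Proprietary"),
    Entry("GLM 5.1", 1471, "MIT"),
    Entry("Kimi K2.6", 1462, "Modified MIT"),
    Entry("Gemma 4 31B", 1451, "Apache 2.0"),
    Entry("Mistral Large 3", 1415, "Apache 2.0"),
]

# Assumed requirement: a permissive license and a score of at least 1,440.
result = shortlist(entries, {"MIT", "Apache 2.0"}, 1440)
for e in result:
    print(e.model, e.score, e.license)
```

Exact string matching on the license is deliberately strict: "Modified MIT" is excluded unless it is added to the allow-list after reviewing its terms.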