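The latest leaderboard of AI text generation models, based on anonymous user votes in the Text Generation Arena. It covers each model's Elo score, 95% confidence interval, vote count, organization, and license.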
Top Model: claude-opus-4-6-thinking
Top Score: 1,507
Model Count: 60
Data version: February 16, 2026
Data source: LMArena
This leaderboard ranks the strongest AI models for text generation. Data comes from LMArena (formerly LMSYS Chatbot Arena), the world's largest crowdsourced AI evaluation platform. Users chat with two anonymous models side-by-side and vote for the better response — rankings are determined entirely by real user preferences, not lab benchmarks.
Blind testing: Users chat with two anonymous models and vote based on response quality, eliminating brand bias.
Elo scoring: Each model's strength score is computed from battle outcomes with the Bradley-Terry model, a statistical relative of chess Elo ratings. A higher score means users prefer that model more often.
Broad scenario coverage: Testing spans coding, creative writing, math reasoning, Q&A, role-playing, and more.
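The scoring idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not LMArena's actual pipeline: the battle data, the model names `model-a`/`model-b`/`model-c`, the helper names `fit_bradley_terry` and `to_elo_scale`, and the anchor of 1000 are all hypothetical, and ties are ignored. It fits Bradley-Terry strengths with the standard iterative (minorize-maximize) update and then maps them onto an Elo-like scale where 400 points correspond to a factor of 10 in strength.

```python
import math
from collections import defaultdict


def fit_bradley_terry(battles, iterations=200):
    """Estimate Bradley-Terry strengths from (winner, loser) pairs.

    Ties are ignored. A model with no wins would get strength 0 and break
    the log-based rescaling, so the toy data gives every model some wins.
    """
    wins = defaultdict(float)    # total wins per model
    games = defaultdict(float)   # games played per unordered pair
    for winner, loser in battles:
        wins[winner] += 1
        games[frozenset((winner, loser))] += 1

    models = sorted({m for pair in games for m in pair})
    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for i in models:
            denom = 0.0
            for pair, n in games.items():
                if i in pair:
                    (j,) = pair - {i}
                    denom += n / (strength[i] + strength[j])
            updated[i] = wins[i] / denom
        # Strengths are only defined up to a common factor, so rescale
        # them to a geometric mean of 1.
        log_mean = sum(math.log(v) for v in updated.values()) / len(models)
        strength = {m: v / math.exp(log_mean) for m, v in updated.items()}
    return strength


def to_elo_scale(strength, anchor=1000.0):
    """Map strengths to an Elo-like scale (400 points per factor of 10)."""
    return {m: anchor + 400.0 * math.log10(s) for m, s in strength.items()}


# Hypothetical votes: (winner, loser), repeated to mimic several battles.
battles = (
    [("model-a", "model-b")] * 7 + [("model-b", "model-a")] * 3
    + [("model-b", "model-c")] * 6 + [("model-c", "model-b")] * 4
    + [("model-a", "model-c")] * 8 + [("model-c", "model-a")] * 2
)
ratings = to_elo_scale(fit_bradley_terry(battles))
for model, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{model}: {score:.0f}")
```

Because only differences between scores carry meaning, the anchor value is arbitrary. A production leaderboard would also need to handle ties and report uncertainty, for example by bootstrapping the vote data.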
DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.
Chart Source: DataLearnerAI · Data Source: LMArena
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | claude-opus-4-6-thinking | 1,507 | +9 | 4,650 | Anthropic | Proprietary |
| 2 | claude-opus-4-6 | 1,504 | +8 | 5,427 | Anthropic | Proprietary |
| 3 | gemini-3-pro | 1,486 | +4 | 36,238 | Google | Proprietary |
| 4 | grok-4.1-thinking | 1,475 | +4 | 35,770 | xAI | Proprietary |
| 5 | gemini-3-flash | 1,473 | +5 | 26,986 | Google | Proprietary |
| 6 | dola-seed-2.0-preview | 1,473 | +10 | 3,154 | Bytedance | Proprietary |
| 7 | claude-opus-4-5-20251101-thinking-32k | 1,471 | +5 | 28,374 | Anthropic | Proprietary |
| 8 | claude-opus-4-5-20251101 | 1,467 | +4 | 33,214 | Anthropic | Proprietary |
| 9 | grok-4.1 | 1,463 | +4 | 39,883 | xAI | Proprietary |
| 10 | gemini-3-flash (thinking-minimal) | 1,462 | +5 | 18,355 | Google | Proprietary |
| 11 | gpt-5.1-high | 1,458 | +4 | 32,297 | OpenAI | Proprietary |
| 12 | glm-5 | 1,455 | +9 | 4,643 | Zai | MIT |
| 13 | ernie-5.0-0110 | 1,453 | +6 | 11,982 | Baidu | Proprietary |
| 14 | claude-sonnet-4-5-20250929-thinking-32k | 1,450 | +4 | 46,773 | Anthropic | Proprietary |
| 15 | claude-sonnet-4-5-20250929 | 1,450 | +4 | 44,565 | Anthropic | Proprietary |
| 16 | gemini-2.5-pro | 1,449 | +3 | 95,526 | Google | Proprietary |
| 17 | ernie-5.0-preview-1203 | 1,449 | +7 | 9,744 | Baidu | Proprietary |
| 18 | claude-opus-4-1-20250805-thinking-16k | 1,449 | +4 | 49,819 | Anthropic | Proprietary |
| 19 | kimi-k2.5-thinking | 1,448 | +7 | 9,050 | Moonshot | Modified MIT |
| 20 | claude-opus-4-1-20250805 | 1,445 | +3 | 75,773 | Anthropic | Proprietary |
| 21 | gpt-4.5-preview-2025-02-27 | 1,444 | +6 | 14,549 | OpenAI | Proprietary |
| 22 | chatgpt-4o-latest-20250326 | 1,442 | +3 | 83,193 | OpenAI | Proprietary |
| 23 | glm-4.7 | 1,441 | +6 | 11,971 | Zai | MIT |
| 24 | gpt-5.2-high | 1,438 | +6 | 17,088 | OpenAI | Proprietary |
| 25 | kimi-k2.5-instant | 1,438 | +9 | 5,007 | Moonshot | Modified MIT |
| 26 | gpt-5.2 | 1,438 | +6 | 13,795 | OpenAI | Proprietary |
| 27 | gpt-5.1 | 1,437 | +4 | 34,522 | OpenAI | Proprietary |
| 28 | gpt-5-high | 1,434 | +5 | 32,559 | OpenAI | Proprietary |
| 29 | qwen3-max-preview | 1,434 | +5 | 27,763 | Alibaba | Proprietary |
| 30 | o3-2025-04-16 | 1,432 | +4 | 61,272 | OpenAI | Proprietary |
| 31 | grok-4.1-fast-reasoning | 1,431 | +4 | 29,040 | xAI | Proprietary |
| 32 | kimi-k2-thinking-turbo | 1,429 | +4 | 34,127 | Moonshot | Modified MIT |
| 33 | gpt-5-chat | 1,426 | +4 | 31,753 | OpenAI | Proprietary |
| 34 | glm-4.6 | 1,425 | +4 | 35,242 | Zai | MIT |
| 35 | qwen3-max-2025-09-23 | 1,425 | +6 | 9,203 | Alibaba | Proprietary |
| 36 | claude-opus-4-20250514-thinking-16k | 1,424 | +4 | 37,930 | Anthropic | Proprietary |
| 37 | deepseek-v3.2-exp-thinking | 1,423 | +7 | 8,981 | DeepSeek | MIT |
| 38 | deepseek-v3.2-exp | 1,423 | +6 | 11,721 | DeepSeek | MIT |
| 39 | qwen3-235b-a22b-instruct-2507 | 1,423 | +3 | 69,847 | Alibaba | Apache 2.0 |
| 40 | grok-4-fast-chat | 1,422 | +8 | 6,983 | xAI | Proprietary |
| 41 | deepseek-v3.2-thinking | 1,420 | +5 | 23,731 | DeepSeek | MIT |
| 42 | deepseek-v3.2 | 1,420 | +5 | 28,747 | DeepSeek | MIT |
| 43 | deepseek-r1-0528 | 1,419 | +6 | 19,281 | DeepSeek | MIT |
| 44 | ernie-5.0-preview-1022 | 1,419 | +9 | 4,594 | Baidu | Proprietary |
| 45 | deepseek-v3.1 | 1,418 | +6 | 15,269 | DeepSeek | MIT |
| 46 | kimi-k2-0905-preview | 1,417 | +6 | 11,959 | Moonshot | Modified MIT |
| 47 | deepseek-v3.1-thinking | 1,417 | +7 | 11,963 | DeepSeek | MIT |
| 48 | kimi-k2-0711-preview | 1,417 | +5 | 28,632 | Moonshot | Modified MIT |
| 49 | deepseek-v3.1-terminus | 1,416 | +10 | 3,757 | DeepSeek | MIT |
| 50 | deepseek-v3.1-terminus-thinking | 1,416 | +10 | 3,547 | DeepSeek | MIT |
| 51 | qwen3-vl-235b-a22b-instruct | 1,415 | +6 | 11,653 | Alibaba | Apache 2.0 |
| 52 | mistral-large-3 | 1,414 | +5 | 24,945 | Mistral | Apache 2.0 |
| 53 | gpt-4.1-2025-04-14 | 1,413 | +4 | 52,121 | OpenAI | Proprietary |
| 54 | claude-opus-4-20250514 | 1,413 | +4 | 45,522 | Anthropic | Proprietary |
| 55 | mistral-medium-2508 | 1,411 | +3 | 63,710 | Mistral | Proprietary |
| 56 | grok-3-preview-02-24 | 1,411 | +4 | 33,966 | xAI | Proprietary |
| 57 | gemini-2.5-flash | 1,411 | +3 | 94,795 | Google | Proprietary |
| 58 | glm-4.5 | 1,410 | +5 | 24,751 | Zai | MIT |
| 59 | grok-4-0709 | 1,410 | +4 | 41,993 | xAI | Proprietary |
| 60 | claude-haiku-4-5-20251001 | 1,406 | +4 | 45,273 | Anthropic | Proprietary |
Data is for reference only; official sources are authoritative.
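To read the scores in the table, it can help to turn a score gap into an implied head-to-head preference rate and to check whether two confidence intervals overlap. The sketch below rests on two assumptions that the page does not state: that scores follow the usual Elo logistic scale (400 points per factor of 10 in odds), and that the "95% CI" column is a symmetric half-width around the score. The function names `win_probability` and `intervals_overlap` are illustrative only.

```python
def win_probability(score_a, score_b):
    """Implied chance that A is preferred over B on a 400-point Elo scale."""
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / 400.0))


def intervals_overlap(score_a, half_a, score_b, half_b):
    """True if [score - half, score + half] for A and B share any point.

    Assumes the CI column is a symmetric half-width (an assumption).
    """
    return abs(score_a - score_b) <= half_a + half_b


# Values taken from the table: rank 1 vs rank 2, rank 3, and rank 60.
top = (1507, 9)        # claude-opus-4-6-thinking
second = (1504, 8)     # claude-opus-4-6
third = (1486, 4)      # gemini-3-pro
last = (1406, 4)       # claude-haiku-4-5-20251001

print(f"rank 1 vs rank 2:  {win_probability(top[0], second[0]):.3f}")
print(f"rank 1 vs rank 60: {win_probability(top[0], last[0]):.3f}")
print("rank 1 and 2 overlap:", intervals_overlap(*top, *second))
print("rank 1 and 3 overlap:", intervals_overlap(*top, *third))
```

Under these assumptions, a gap of a few points implies a preference rate very close to 50%, and overlapping intervals suggest that adjacent ranks may not be reliably distinguishable.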