LMSys Chatbot Arena evaluates large language models through anonymous, crowdsourced comparisons and ranks them by user votes.
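To make the vote-to-ranking idea concrete, here is a minimal sketch of the classic online Elo update applied to pairwise votes. It is an illustration only: the official leaderboard may compute its Arena Elo scores with a different statistical fit, and the model names, the `k` factor, the initial rating of 1000, and the helper names `expected_score` and `update_elo` are all assumptions made for this example.

```python
def expected_score(rating_a: float, rating_b: float, scale: float = 400.0) -> float:
    """Probability that A is preferred over B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / scale))


def update_elo(ratings: dict, model_a: str, model_b: str, outcome: float,
               k: float = 4.0, initial: float = 1000.0) -> dict:
    """Apply one vote. outcome: 1.0 = A wins, 0.0 = B wins, 0.5 = tie.

    k and initial are assumed values, not the leaderboard's settings.
    """
    ra = ratings.get(model_a, initial)
    rb = ratings.get(model_b, initial)
    ea = expected_score(ra, rb)
    # Each side moves by k times (actual result - expected result).
    ratings[model_a] = ra + k * (outcome - ea)
    ratings[model_b] = rb + k * ((1.0 - outcome) - (1.0 - ea))
    return ratings


# Hypothetical votes between hypothetical models.
votes = [
    ("model-x", "model-y", 1.0),
    ("model-y", "model-z", 0.5),
    ("model-x", "model-z", 1.0),
]
ratings: dict = {}
for a, b, outcome in votes:
    update_elo(ratings, a, b, outcome)

for name, score in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.1f}")

# Reading a rating gap as a win probability: 1,287 vs 1,249.
p = expected_score(1287, 1249)
print(f"expected win rate: {p:.3f}")
```

Under this model, a gap of 38 points (for example 1,287 against 1,249) corresponds to an expected preference rate of only about 55%, so small differences between neighbouring rows should not be over-interpreted.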
📣 Data version: 20240610
Data source: official LM-SYS website
| Rank | Model | Votes | Arena Elo | MT-Bench | MMLU | Publisher | License |
|---|---|---|---|---|---|---|---|
| 1 | GPT-4o-2024-05-13 | 34,985 | 1,287 | / | 88.70 | OpenAI | Proprietary |
| 2 | Gemini-Advanced-0514 | 29,838 | 1,267 | / | / | Google | Proprietary |
| 2 | Gemini-1.5-Pro-API-0514 | 28,170 | 1,266 | / | 85.90 | Google | Proprietary |
| 4 | Gemini-1.5-Pro-API-0409-Preview | 55,731 | 1,258 | / | 81.90 | Google | Proprietary |
| 4 | GPT-4-Turbo-2024-04-09 | 61,122 | 1,256 | / | / | OpenAI | Proprietary |
| 6 | GPT-4-1106-preview | 80,987 | 1,251 | 9.32 | / | OpenAI | Proprietary |
| 6 | Claude 3 Opus | 126,356 | 1,249 | / | 86.80 | Anthropic | Proprietary |
| 6 | GPT-4-0125-preview | 74,232 | 1,246 | / | / | OpenAI | Proprietary |
| 9 | Yi-Large-preview | 36,412 | 1,239 | / | / | 01 AI | Proprietary |
| 10 | Gemini-1.5-Flash-API-0514 | 26,409 | 1,232 | / | 78.90 | Google | Proprietary |
| 11 | Bard (Gemini Pro) | 11,853 | 1,208 | / | / | Google | Proprietary |
| 11 | Llama-3-70b-Instruct | 127,901 | 1,208 | / | 82 | Meta | Llama 3 Community |
| 12 | Claude 3 Sonnet | 98,168 | 1,202 | / | 79 | Anthropic | Proprietary |
| 12 | Reka-Core-20240501 | 44,097 | 1,200 | / | 83.20 | Reka AI | Proprietary |
| 15 | Command R+ | 64,622 | 1,189 | / | / | Cohere | CC-BY-NC-4.0 |
| 15 | Qwen2-72B-Instruct | 12,369 | 1,187 | 9.12 | 84.20 | Alibaba | Qianwen LICENSE |
| 15 | GPT-4-0314 | 56,063 | 1,186 | 8.96 | 86.40 | OpenAI | Proprietary |
| 15 | GLM-4-0116 | 7,595 | 1,184 | / | / | Zhipu AI | Proprietary |
| 15 | Qwen-Max-0428 | 24,659 | 1,183 | / | / | Alibaba | Proprietary |
| 18 | Claude 3 Haiku | 88,538 | 1,178 | / | 75.20 | Anthropic | Proprietary |
| 21 | Qwen1.5-110B-Chat | 25,525 | 1,163 | 8.88 | 80.40 | Alibaba | Qianwen LICENSE |
| 21 | GPT-4-0613 | 77,797 | 1,161 | 9.18 | / | OpenAI | Proprietary |
| 21 | Yi-1.5-34B-Chat | 9,891 | 1,161 | / | 76.80 | 01 AI | Apache-2.0 |
| 22 | Mistral-Large-2402 | 56,540 | 1,156 | / | 81.20 | Mistral | Proprietary |
| 21 | Reka-Flash-21B-online | 16,039 | 1,156 | / | / | Reka AI | Proprietary |
| 23 | Llama-3-8b-Instruct | 86,326 | 1,153 | / | 68.40 | Meta | Llama 3 Community |
| 24 | Claude-1 | 21,216 | 1,149 | 7.90 | 77 | Anthropic | Proprietary |
| 26 | Mistral Medium | 35,600 | 1,148 | 8.61 | 75.30 | Mistral | Proprietary |
| 26 | Command R | 47,590 | 1,148 | / | / | Cohere | CC-BY-NC-4.0 |
| 26 | Reka-Flash-21B | 24,537 | 1,148 | / | 73.50 | Reka AI | Proprietary |
| 27 | Qwen1.5-72B-Chat | 40,263 | 1,147 | 8.61 | 77.50 | Alibaba | Qianwen LICENSE |
| 27 | Mixtral-8x22b-Instruct-v0.1 | 37,703 | 1,146 | / | 77.80 | Mistral | Apache 2.0 |
| 33 | Gemini Pro (Dev API) | 18,839 | 1,131 | / | 71.80 | Google | Proprietary |
| 33 | Claude-2.0 | 12,789 | 1,131 | 8.06 | 78.50 | Anthropic | Proprietary |
| 33 | Zephyr-ORPO-141b-A35b-v0.1 | 4,890 | 1,127 | / | / | HuggingFace | Apache 2.0 |
| 33 | Qwen1.5-32B-Chat | 22,318 | 1,126 | 8.30 | 73.40 | Alibaba | Qianwen LICENSE |
| 33 | Mistral-Next | 12,403 | 1,124 | / | / | Mistral | Proprietary |
| 33 | Phi-3-Medium-4k-Instruct | 10,549 | 1,123 | / | 78 | Microsoft | MIT |
| 35 | Starling-LM-7B-beta | 16,696 | 1,119 | 8.12 | / | Nexusflow | Apache-2.0 |
| 35 | Claude-2.1 | 37,745 | 1,118 | 8.18 | / | Anthropic | Proprietary |
| 36 | GPT-3.5-Turbo-0613 | 39,045 | 1,117 | 8.39 | / | OpenAI | Proprietary |
| 41 | Mixtral-8x7b-Instruct-v0.1 | 67,665 | 1,114 | 8.30 | 70.60 | Mistral | Apache 2.0 |
| 39 | Gemini Pro | 6,581 | 1,111 | / | 71.80 | Google | Proprietary |
| 42 | Claude-Instant-1 | 20,675 | 1,111 | 7.85 | 73.40 | Anthropic | Proprietary |
| 39 | Yi-34B-Chat | 15,946 | 1,111 | / | 73.50 | 01 AI | Yi License |
| 43 | Qwen1.5-14B-Chat | 18,696 | 1,108 | 7.91 | 67.60 | Alibaba | Qianwen LICENSE |
| 40 | GPT-3.5-Turbo-0314 | 5,670 | 1,106 | 7.94 | 70 | OpenAI | Proprietary |
| 43 | WizardLM-70B-v1.0 | 8,421 | 1,106 | 7.71 | 63.70 | Microsoft | Llama 2 Community |
| 44 | GPT-3.5-Turbo-0125 | 59,237 | 1,103 | / | / | OpenAI | Proprietary |
| 44 | DBRX-Instruct-Preview | 31,752 | 1,102 | / | 73.70 | Databricks | DBRX LICENSE |
| 45 | Phi-3-Small-8k-Instruct | 11,503 | 1,101 | / | 75.70 | Microsoft | MIT |
| 46 | Tulu-2-DPO-70B | 6,674 | 1,099 | 7.89 | / | AllenAI/UW | AI2 ImpACT Low-risk |
| 51 | Llama-2-70b-chat | 39,695 | 1,093 | 6.86 | 63 | Meta | Llama 2 Community |
| 51 | OpenChat-3.5-0106 | 13,010 | 1,091 | 7.80 | 65.80 | OpenChat | Apache-2.0 |
| 52 | Vicuna-33B | 23,001 | 1,090 | 7.12 | 59.20 | LMSYS | Non-commercial |
| 52 | Snowflake Arctic Instruct | 32,189 | 1,090 | / | 67.30 | Snowflake | Apache 2.0 |
| 52 | Starling-LM-7B-alpha | 10,437 | 1,088 | 8.09 | 63.90 | UC Berkeley | CC-BY-NC-4.0 |
| 53 | Nous-Hermes-2-Mixtral-8x7B-DPO | 3,840 | 1,084 | / | / | NousResearch | Apache-2.0 |
| 54 | Gemma-1.1-7B-it | 20,553 | 1,084 | / | 64.30 | Google | Gemma license |
| 53 | NV-Llama2-70B-SteerLM-Chat | 3,640 | 1,080 | 7.54 | 68.50 | Nvidia | Llama 2 Community |
| 57 | DeepSeek-LLM-67B-Chat | 5,000 | 1,077 | / | 71.30 | DeepSeek AI | DeepSeek License |
| 57 | pplx-70b-online | 6,909 | 1,077 | / | / | Perplexity AI | Proprietary |
| 57 | OpenChat-3.5 | 8,121 | 1,076 | 7.81 | 64.30 | OpenChat | Apache-2.0 |
| 58 | OpenHermes-2.5-Mistral-7b | 5,096 | 1,074 | / | / | NousResearch | Apache-2.0 |
| 60 | Mistral-7B-Instruct-v0.2 | 20,099 | 1,072 | 7.60 | / | Mistral | Apache-2.0 |
| 59 | Qwen1.5-7B-Chat | 4,878 | 1,070 | 7.60 | 61 | Alibaba | Qianwen LICENSE |
| 61 | GPT-3.5-Turbo-1106 | 17,063 | 1,068 | 8.32 | / | OpenAI | Proprietary |
| 60 | Phi-3-Mini-4k-Instruct | 16,089 | 1,066 | / | 68.80 | Microsoft | MIT |
| 63 | Llama-2-13b-chat | 19,769 | 1,063 | 6.65 | 53.60 | Meta | Llama 2 Community |
| 60 | Dolphin-2.2.1-Mistral-7B | 1,716 | 1,063 | / | / | Cognitive Computations | Apache-2.0 |
| 61 | SOLAR-10.7B-Instruct-v1.0 | 4,293 | 1,062 | 7.58 | 66.20 | Upstage AI | CC-BY-NC-4.0 |
| 66 | WizardLM-13b-v1.2 | 7,203 | 1,058 | 7.20 | 52.70 | Microsoft | Llama 2 Community |
| 69 | Zephyr-7b-beta | 11,345 | 1,053 | 7.34 | 61.40 | HuggingFace | MIT |
| 70 | MPT-30B-chat | 2,651 | 1,045 | 6.39 | 50.40 | MosaicML | CC-BY-NC-SA-4.0 |
| 71 | pplx-7b-online | 6,347 | 1,045 | / | / | Perplexity AI | Proprietary |
| 73 | CodeLlama-34B-instruct | 7,532 | 1,043 | / | 53.70 | Meta | Llama 2 Community |
| 69 | CodeLlama-70B-instruct | 1,196 | 1,042 | / | / | Meta | Llama 2 Community |
| 70 | Zephyr-7b-alpha | 1,816 | 1,042 | 6.88 | / | HuggingFace | MIT |
| 74 | Vicuna-13B | 19,833 | 1,041 | 6.57 | 55.80 | LMSYS | Llama 2 Community |
| 74 | Gemma-7B-it | 9,207 | 1,037 | / | 64.30 | Google | Gemma license |
| 74 | Llama-2-7b-chat | 14,604 | 1,037 | 6.27 | 45.80 | Meta | Llama 2 Community |
| 74 | Phi-3-Mini-128k-Instruct | 21,678 | 1,037 | / | 68.10 | Microsoft | MIT |
| 74 | Qwen-14B-Chat | 5,076 | 1,034 | 6.96 | 66.50 | Alibaba | Qianwen LICENSE |
| 71 | falcon-180b-chat | 1,332 | 1,033 | / | 68 | TII | Falcon-180B TII License |
| 74 | Guanaco-33B | 3,002 | 1,032 | 6.53 | 57.60 | UW | Non-commercial |
| 83 | Gemma-1.1-2B-it | 10,882 | 1,021 | / | 64.30 | Google | Gemma license |
| 83 | StripedHyena-Nous-7B | 5,291 | 1,017 | / | / | Together AI | Apache 2.0 |
| 84 | OLMo-7B-instruct | 6,527 | 1,015 | / | / | Allen AI | Apache-2.0 |
| 87 | Mistral-7B-Instruct-v0.1 | 9,165 | 1,008 | 6.84 | 55.40 | Mistral | Apache 2.0 |
| 87 | Vicuna-7B | 7,038 | 1,004 | 6.17 | 49.80 | LMSYS | Llama 2 Community |
| 87 | PaLM-Chat-Bison-001 | 8,758 | 1,003 | 6.40 | / | Google | Proprietary |
| 90 | Gemma-2B-it | 4,936 | 989 | / | 42.30 | Google | Gemma license |
| 91 | Qwen1.5-4B-Chat | 7,839 | 988 | / | 56.10 | Alibaba | Qianwen LICENSE |
| 94 | Koala-13B | 7,041 | 964 | 5.35 | 44.70 | UC Berkeley | Non-commercial |
| 94 | ChatGLM3-6B | 4,770 | 955 | / | / | Tsinghua | Apache-2.0 |
| 95 | GPT4All-13B-Snoozy | 1,793 | 932 | 5.41 | 43 | Nomic AI | Non-commercial |
| 96 | MPT-7B-Chat | 4,026 | 927 | 5.42 | 32 | MosaicML | CC-BY-NC-SA-4.0 |
| 96 | ChatGLM2-6B | 2,713 | 924 | 4.96 | 45.50 | Tsinghua | Apache-2.0 |
| 96 | RWKV-4-Raven-14B | 4,954 | 921 | 3.98 | 25.60 | RWKV | Apache 2.0 |
| 100 | Alpaca-13B | 5,887 | 901 | 4.53 | 48.10 | Stanford | Non-commercial |
| 100 | OpenAssistant-Pythia-12B | 6,393 | 893 | 4.32 | 27 | OpenAssistant | Apache 2.0 |
| 101 | ChatGLM-6B | 5,004 | 879 | 4.50 | 36.10 | Tsinghua | Non-commercial |
| 102 | FastChat-T5-3B | 4,311 | 868 | 3.04 | 47.70 | LMSYS | Apache 2.0 |
| 104 | StableLM-Tuned-Alpha-7B | 3,347 | 840 | 2.75 | 24.40 | Stability AI | CC-BY-NC-SA-4.0 |
| 104 | Dolly-V2-12B | 3,497 | 823 | 3.28 | 25.70 | Databricks | MIT |
| 105 | LLaMA-13B | 2,450 | 798 | 2.61 | 47 | Meta | Non-commercial |
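⚠️ The data is for reference only; the official source is authoritative. The link next to each model name leads to the DataLearner model detail page.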