The Open LLM Leaderboard tracks evaluation results for large language models, ranking and assessing LLMs and chatbots by their performance across a set of benchmark tasks.
Data source: HuggingFace
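The Average column in the table below is the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K). A minimal sketch, using the scores from the table's top row to illustrate:

```python
# The "Average" column is the mean of the six benchmark scores.
# Scores below are copied from the top row of the table
# (OpenHermes-2.5-Mistral-7B-mt-bench-DPO-recovered).
scores = {
    "ARC": 65.27,
    "HellaSwag": 84.62,
    "MMLU": 63.82,
    "TruthfulQA": 52.91,
    "Winogrande": 78.06,
    "GSM8K": 58.30,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # → 67.16, matching the row's Average column
```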
| Model | Type | Parameters (×10⁸) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO-recovered | Chat Models | 72.4 | 67.16 | 65.27 | 84.62 | 63.82 | 52.91 | 78.06 | 58.3 | MistralForCausalLM |
| bagel-dpo-7b-v0.4 | Chat Models | 72.4 | 67.13 | 67.58 | 84.3 | 61.95 | 63.94 | 78.14 | 46.85 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO | Chat Models | 72.4 | 67.1 | 65.27 | 84.62 | 63.83 | 52.91 | 78.06 | 57.92 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO-corrupted | Chat Models | 72.4 | 67.09 | 65.27 | 84.58 | 63.74 | 52.84 | 78.06 | 58.07 | MistralForCausalLM |
| Bookworm-10.7B-v0.4-DPO | Chat Models | 108 | 66.59 | 64.76 | 84.4 | 64.96 | 52.31 | 80.9 | 52.24 | LlamaForCausalLM |
| shark_tank_ai_7b_v2 | Chat Models | 72.4 | 66.55 | 67.75 | 87.06 | 58.79 | 62.15 | 78.45 | 45.11 | MistralForCausalLM |
| LLaMA-2-Wizard-70B-QLoRA | Chat Models | 700 | 66.47 | 67.58 | 87.52 | 69.11 | 61.79 | 82.32 | 30.48 | Unknown |
| openhermes-2_5-dpo-no-robots | Chat Models | 72.4 | 66.4 | 64.93 | 84.3 | 63.86 | 52.12 | 77.9 | 55.27 | MistralForCausalLM |
| Snorkel-Mistral-PairRM-DPO | Chat Models | 0 | 66.31 | 66.04 | 85.64 | 60.83 | 70.86 | 77.74 | 36.77 | MistralForCausalLM |
| Snorkel-Mistral-PairRM-DPO | Chat Models | 0 | 66.18 | 65.96 | 85.63 | 60.85 | 70.91 | 77.58 | 36.16 | MistralForCausalLM |
| mistral-11b-slimorca | Chat Models | 107.3 | 66.12 | 64.25 | 83.81 | 63.66 | 54.66 | 77.98 | 52.39 | MistralForCausalLM |
| where-llambo-7b | Chat Models | 72.4 | 66.08 | 58.45 | 82.06 | 62.61 | 49.61 | 78.53 | 65.2 | MistralForCausalLM |
| NeuralHermes-2.5-Mistral-7B | Chat Models | 72.4 | 66.06 | 67.58 | 85.69 | 63.43 | 55.98 | 77.98 | 45.72 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2-attention-sparsity-20 | Chat Models | 72.4 | 65.74 | 62.88 | 84.84 | 60.81 | 68.26 | 77.9 | 39.73 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2 | Chat Models | 72.4 | 65.71 | 63.14 | 84.88 | 60.78 | 68.26 | 77.19 | 40.03 | MistralForCausalLM |
| alooowso | Chat Models | 72.4 | 65.63 | 62.97 | 84.87 | 60.78 | 68.18 | 77.43 | 39.58 | Unknown |
| Lima_Unchained_70b | Chat Models | 700 | 65.51 | 68.26 | 87.65 | 70.0 | 48.76 | 83.66 | 34.72 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-attention-sparsity-30 | Chat Models | 72.4 | 65.51 | 62.97 | 84.71 | 60.49 | 67.49 | 77.98 | 39.42 | MistralForCausalLM |
| Synatra-10.7B-v0.4 | Chat Models | 107 | 65.48 | 64.93 | 82.47 | 62.5 | 51.11 | 81.85 | 50.04 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-attention-sparsity-10-v0.1 | Chat Models | 72.4 | 65.48 | 63.05 | 84.88 | 60.84 | 68.11 | 77.11 | 38.89 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2-sparsity-10 | Chat Models | 72.4 | 65.48 | 62.88 | 84.85 | 60.87 | 67.93 | 77.51 | 38.82 | MistralForCausalLM |
| Metis-0.3 | Chat Models | 72.4 | 65.44 | 62.71 | 84.8 | 60.92 | 67.56 | 77.27 | 39.35 | MistralForCausalLM |
| KoSoLAR-10.7B-v0.2_1.3_dedup_p | Chat Models | 108 | 65.43 | 63.05 | 83.63 | 64.61 | 52.69 | 80.51 | 48.07 | LlamaForCausalLM |
| lemur-70b-chat-v1 | Chat Models | 700 | 65.38 | 66.98 | 85.73 | 65.99 | 56.58 | 81.69 | 35.33 | LlamaForCausalLM |
| Yi-34B-Chat | Chat Models | 343.9 | 65.32 | 65.44 | 84.16 | 74.9 | 55.37 | 80.11 | 31.92 | LlamaForCausalLM |
| MathHermes-2.5-Mistral-7B | Chat Models | 72.4 | 65.24 | 64.76 | 84.19 | 63.59 | 51.95 | 77.66 | 49.28 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-MISALIGNED | Chat Models | 72.4 | 64.92 | 65.36 | 84.67 | 63.74 | 52.85 | 77.66 | 45.26 | Unknown |
| bagel-7b-v0.4 | Chat Models | 72.4 | 64.82 | 63.57 | 82.67 | 62.25 | 54.2 | 78.93 | 47.31 | MistralForCausalLM |
| openbuddy-mixtral-7bx8-v17.1-32k | Chat Models | 467.4 | 64.73 | 65.53 | 75.95 | 70.02 | 42.14 | 75.69 | 59.06 | MixtralForCausalLM |
| PiVoT-0.1-early | Chat Models | 72.4 | 64.58 | 62.46 | 82.97 | 61.02 | 62.89 | 73.72 | 44.43 | MistralForCausalLM |
| v1olet_merged_dpo_7B_v4 | Chat Models | 70 | 64.3 | 66.98 | 84.09 | 59.02 | 59.43 | 81.06 | 35.25 | Unknown |
| PiVoT-10.7B-Mistral-v0.2 | Chat Models | 107.3 | 64.25 | 63.31 | 81.68 | 59.86 | 58.23 | 80.03 | 42.38 | MistralForCausalLM |
| Mistral-Instruct-7B-v0.2-ChatAlpaca-DPO2 | Chat Models | 70 | 64.05 | 61.86 | 83.71 | 59.19 | 64.08 | 78.45 | 37.0 | ? |
| CausalLM-Platypus-14B | Chat Models | 141.7 | 63.8 | 56.91 | 80.06 | 64.98 | 47.57 | 76.01 | 57.24 | LlamaForCausalLM |
| openhermes_dpo_norobot_0201 | Chat Models | 72.4 | 63.78 | 62.03 | 83.4 | 62.4 | 47.44 | 78.22 | 49.2 | MistralForCausalLM |
| zephyr-7b-sft-full-SPIN-iter3 | Chat Models | 72.4 | 63.7 | 66.13 | 85.85 | 61.51 | 57.89 | 76.64 | 34.19 | MistralForCausalLM |
| openinstruct-mistral-7b | Chat Models | 72.4 | 63.64 | 59.73 | 82.77 | 60.55 | 48.76 | 79.56 | 50.49 | MistralForCausalLM |
| Hercules-2.5-Mistral-7B | Chat Models | 72.4 | 63.59 | 62.03 | 83.79 | 63.49 | 43.44 | 79.72 | 49.05 | MistralForCausalLM |
| Mini_Synatra_SFT | Chat Models | 0 | 63.39 | 62.46 | 83.44 | 61.2 | 53.67 | 74.66 | 44.88 | MistralForCausalLM |
| aanaphi2-v0.1 | Chat Models | 27.8 | 63.28 | 63.91 | 77.97 | 57.73 | 51.56 | 73.64 | 54.89 | PhiForCausalLM |
| DeciLM-7B-instruct | Chat Models | 70.4 | 63.19 | 61.01 | 82.37 | 60.24 | 49.75 | 79.72 | 46.02 | DeciLMForCausalLM |
| Yi-34B-Chat | Chat Models | 343.9 | 63.17 | 65.1 | 84.08 | 74.87 | 55.41 | 79.79 | 19.79 | LlamaForCausalLM |
| CollectiveCognition-v1.1-Mistral-7B | Chat Models | 70 | 62.92 | 62.12 | 84.17 | 62.35 | 57.62 | 75.37 | 35.86 | MistralForCausalLM |
| openbuddy-mixtral-7bx8-v17.3-32k | Chat Models | 467.4 | 62.81 | 64.51 | 66.96 | 70.0 | 59.14 | 68.11 | 48.14 | MixtralForCausalLM |
| SG-Raccoon-Yi-200k-2.0 | Chat Models | 555.9 | 62.72 | 62.54 | 80.26 | 73.29 | 53.21 | 76.32 | 30.71 | Unknown |
| Hercules-2.0-Mistral-7B | Chat Models | 72.4 | 62.69 | 61.09 | 83.69 | 63.47 | 43.97 | 79.48 | 44.43 | MistralForCausalLM |
| zephyr-7b-dpo-qlora-no-sft | Chat Models | 70 | 62.67 | 62.46 | 84.5 | 64.02 | 44.25 | 79.16 | 41.62 | ? |
| Metis-0.5 | Chat Models | 72.4 | 62.65 | 62.63 | 83.77 | 62.16 | 49.33 | 75.14 | 42.91 | MistralForCausalLM |
| internlm2-chat-20b-llama | Chat Models | 198.6 | 62.56 | 63.65 | 82.58 | 66.89 | 48.74 | 79.56 | 33.97 | LlamaForCausalLM |
| karakuri-lm-70b-chat-v0.1 | Chat Models | 692 | 62.36 | 61.52 | 83.13 | 59.35 | 51.39 | 78.37 | 40.41 | LlamaForCausalLM |
| PlatYi-34B-200K-Q | Chat Models | 343.9 | 62.0 | 63.91 | 83.52 | 75.19 | 44.21 | 81.06 | 24.11 | LlamaForCausalLM |
| Iambe-20b-DARE-v2 | Chat Models | 199.9 | 61.99 | 62.8 | 84.53 | 60.45 | 53.85 | 77.03 | 33.28 | LlamaForCausalLM |
| zephyr-7b-truthy | Chat Models | 72.4 | 61.93 | 60.75 | 84.64 | 59.53 | 63.31 | 77.9 | 25.47 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2 | Chat Models | 70 | 61.79 | 60.15 | 82.79 | 60.07 | 56.06 | 76.87 | 34.8 | ? |
| juud-Mistral-7B | Chat Models | 72.4 | 61.72 | 66.72 | 85.0 | 63.38 | 54.12 | 77.98 | 23.12 | MistralForCausalLM |
| rainbowfish-v6 | Chat Models | 72.4 | 61.64 | 61.95 | 82.51 | 62.79 | 48.37 | 77.9 | 36.32 | MistralForCausalLM |
| zephyr-7b-dpo-full-beta-0.2 | Chat Models | 72.4 | 61.55 | 61.77 | 84.04 | 61.79 | 54.72 | 76.95 | 30.02 | MistralForCausalLM |
| zephyr-7b-dpo-full-beta-0.2 | Chat Models | 72.4 | 61.36 | 61.86 | 83.98 | 61.85 | 54.78 | 76.95 | 28.73 | MistralForCausalLM |
| Metis-0.3-merged | Chat Models | 72.4 | 61.34 | 62.2 | 84.0 | 62.65 | 59.24 | 78.14 | 21.83 | Unknown |
| Mistral-7B-v0.1-DPO | Chat Models | 72.4 | 61.3 | 60.32 | 83.69 | 64.01 | 43.53 | 79.01 | 37.23 | MistralForCausalLM |
| Deacon-20B | Chat Models | 200.9 | 61.28 | 60.75 | 81.74 | 60.7 | 58.49 | 76.8 | 29.19 | LlamaForCausalLM |
| dpo-phi2 | Chat Models | 27.8 | 61.26 | 61.69 | 75.13 | 58.1 | 43.99 | 74.19 | 54.44 | PhiForCausalLM |
| WizardLM-70B-V1.0 | Chat Models | 700 | 61.25 | 65.44 | 84.41 | 64.05 | 54.81 | 80.82 | 17.97 | LlamaForCausalLM |
| Mini_DPO_test02 | Chat Models | 72.4 | 61.23 | 59.73 | 83.89 | 61.9 | 48.47 | 78.37 | 35.03 | MistralForCausalLM |
| Mistral-Instruct-7B-v0.2-ChatAlpaca | Chat Models | 70 | 61.21 | 56.74 | 80.82 | 59.1 | 55.86 | 77.11 | 37.6 | ? |
| mistral-7b-ft-h4-no_robots_instructions | Chat Models | 72.4 | 61.16 | 60.92 | 83.17 | 63.37 | 43.63 | 78.85 | 37.0 | MistralForCausalLM |
| mistral-7b-ft-h4-no_robots_instructions | Chat Models | 72.4 | 61.16 | 60.92 | 83.24 | 63.74 | 43.64 | 78.69 | 36.69 | MistralForCausalLM |
| PlatYi-34B-Llama-Q-v3 | Chat Models | 343.9 | 61.15 | 64.33 | 84.88 | 74.98 | 51.8 | 84.21 | 6.67 | LlamaForCausalLM |
| dolphin-2.1-mistral-7b | Chat Models | 71.1 | 61.12 | 64.42 | 84.92 | 63.32 | 55.56 | 77.74 | 20.77 | Unknown |
| AISquare-Instruct-SOLAR-10.7b-v0.5.31 | Chat Models | 107 | 61.05 | 60.67 | 84.2 | 52.86 | 51.35 | 82.95 | 34.27 | LlamaForCausalLM |
| juud-Mistral-7B-dpo | Chat Models | 72.4 | 60.89 | 66.81 | 84.89 | 63.03 | 53.51 | 78.3 | 18.8 | MistralForCausalLM |
| oasst-rlhf-2-llama-30b-7k-steps-hf | Chat Models | 300 | 60.74 | 61.35 | 83.8 | 57.89 | 51.18 | 78.77 | 31.46 | LlamaForCausalLM |
| openbuddy-mistral-7b-v17.1-32k | Chat Models | 72.8 | 60.69 | 55.38 | 78.0 | 58.08 | 56.07 | 75.22 | 41.39 | MistralForCausalLM |
| Synatra-7B-v0.3-dpo | Chat Models | 70 | 60.55 | 62.8 | 82.58 | 61.46 | 56.46 | 76.24 | 23.73 | MistralForCausalLM |
| Damysus-2.7B-Chat | Chat Models | 27.8 | 60.49 | 59.81 | 74.52 | 56.33 | 46.74 | 74.9 | 50.64 | PhiForCausalLM |
| traversaal-2.5-Mistral-7B | Chat Models | 72.4 | 60.48 | 66.21 | 85.02 | 63.24 | 54.0 | 77.9 | 16.53 | MistralForCausalLM |
| WizardMath-70B-V1.0 | Chat Models | 700 | 60.42 | 68.17 | 86.49 | 68.89 | 52.69 | 82.32 | 3.94 | LlamaForCausalLM |
| WizardMath-70B-V1.0 | Chat Models | 700 | 60.41 | 67.92 | 86.46 | 68.92 | 52.77 | 82.32 | 4.09 | LlamaForCausalLM |
| Xenon-4 | Chat Models | 72.4 | 60.39 | 60.15 | 83.07 | 60.08 | 61.31 | 77.03 | 20.7 | MistralForCausalLM |
| Xenon-3 | Chat Models | 72.4 | 60.27 | 58.87 | 83.39 | 59.79 | 61.99 | 77.51 | 20.09 | MistralForCausalLM |
| Damysus-2.7B-Chat | Chat Models | 27.8 | 60.25 | 59.13 | 74.36 | 56.34 | 46.45 | 75.06 | 50.19 | PhiForCausalLM |
| smartyplats-7b-v2 | Chat Models | 70 | 60.24 | 57.94 | 80.76 | 58.16 | 50.26 | 75.53 | 38.82 | MistralForCausalLM |
| CollectiveCognition-v1-Mistral-7B | Chat Models | 70 | 60.1 | 62.37 | 85.5 | 62.76 | 54.48 | 77.58 | 17.89 | MistralForCausalLM |
| zephyr-python-ru | Chat Models | 0 | 60.08 | 56.14 | 82.03 | 60.18 | 52.8 | 76.8 | 32.52 | Unknown |
| freeze_KoSoLAR-10.7B-v0.2_1.4_dedup | Chat Models | 108 | 60.06 | 58.45 | 81.26 | 64.83 | 44.5 | 79.08 | 32.22 | LlamaForCausalLM |
| Metis-0.1 | Chat Models | 72.4 | 60.02 | 60.15 | 82.85 | 61.42 | 45.24 | 77.27 | 33.21 | MistralForCausalLM |
| AIRIC-The-Mistral | Chat Models | 72.4 | 59.95 | 59.98 | 82.98 | 60.67 | 48.24 | 76.95 | 30.86 | LlamaForCausalLM |
| Xenon-2 | Chat Models | 72.4 | 59.93 | 57.51 | 83.28 | 60.25 | 60.92 | 78.22 | 19.41 | MistralForCausalLM |
| samantha-1.2-mistral-7b | Chat Models | 71.1 | 59.83 | 64.08 | 85.08 | 63.91 | 50.4 | 78.53 | 16.98 | Unknown |
| WizardMath-70B-V1.0 | Chat Models | 700 | 59.81 | 67.49 | 86.03 | 68.44 | 52.23 | 81.77 | 2.88 | LlamaForCausalLM |
| AISquare-Instruct-SOLAR-10.7b-v0.5.32 | Chat Models | 107 | 59.79 | 61.86 | 84.66 | 63.13 | 51.19 | 82.79 | 15.09 | LlamaForCausalLM |
| H4rmoniousAnthea | Chat Models | 72.4 | 59.76 | 65.87 | 84.09 | 63.67 | 55.08 | 76.87 | 12.96 | MistralForCausalLM |
| ToxicHermes-2.5-Mistral-7B | Chat Models | 72.4 | 59.69 | 64.59 | 83.75 | 63.67 | 50.84 | 77.9 | 17.36 | MistralForCausalLM |
| Synatra-RP-Orca-2-7b-v0.1 | Chat Models | 67.4 | 59.65 | 57.68 | 77.37 | 56.1 | 52.52 | 74.59 | 39.65 | LlamaForCausalLM |
| Orca-2-13B-no_robots | Chat Models | 130.2 | 59.63 | 59.13 | 79.57 | 60.28 | 51.17 | 80.35 | 27.29 | Unknown |
| Camelidae-8x13B | Chat Models | 130 | 59.4 | 61.18 | 82.73 | 57.21 | 43.37 | 77.35 | 34.57 | LlamaForCausalLM |
| deepseek-llm-7b-chat | Chat Models | 70 | 59.38 | 55.8 | 79.38 | 51.75 | 47.98 | 74.82 | 46.55 | LlamaForCausalLM |
| Mistral-7B-claude-instruct | Chat Models | 70 | 59.27 | 63.23 | 84.99 | 63.84 | 47.47 | 78.14 | 17.97 | MistralForCausalLM |
| Synatra-7B-v0.3-RP | Chat Models | 70 | 59.26 | 62.2 | 82.29 | 60.8 | 52.64 | 76.48 | 21.15 | MistralForCausalLM |
| Xenon-1 | Chat Models | 72.4 | 59.21 | 55.29 | 81.56 | 61.22 | 56.68 | 78.69 | 21.83 | MistralForCausalLM |

Figures are for reference only; refer to the official sources for authoritative results. The link next to each model name leads to its DataLearner model detail page.