The Open LLM Leaderboard tracks evaluation results for large language models, ranking LLMs and chatbots by their scores on a suite of benchmark tasks.
Data source: HuggingFace. Scores are for reference only; defer to the official source. Links next to model names lead to the corresponding DataLearner model detail pages.
| Model | Type | Params (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| Velara-11B-V2 | Fine Tuned Models | 113.9 | 65.55 | 63.82 | 85.85 | 63.62 | 58.83 | 77.82 | 43.37 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2-sparsity-20-v0.1 | Fine Tuned Models | 72.4 | 65.54 | 62.29 | 84.9 | 60.63 | 67.66 | 77.66 | 40.11 | MistralForCausalLM |
| neural-chat-11b-v3-2 | Fine Tuned Models | 107.3 | 65.52 | 66.64 | 82.12 | 62.37 | 60.22 | 79.64 | 42.15 | MistralForCausalLM |
| Lima_Unchained_70b | Chat Models | 700 | 65.51 | 68.26 | 87.65 | 70.0 | 48.76 | 83.66 | 34.72 | LlamaForCausalLM |
| model_42_70b | Fine Tuned Models | 687.2 | 65.51 | 68.26 | 87.65 | 70.0 | 48.76 | 83.66 | 34.72 | Unknown |
| Mistral-7B-Instruct-v0.2-attention-sparsity-30 | Chat Models | 72.4 | 65.51 | 62.97 | 84.71 | 60.49 | 67.49 | 77.98 | 39.42 | MistralForCausalLM |
| Llama-2-70b-oasst-1-200 | Fine Tuned Models | 700 | 65.5 | 67.66 | 87.24 | 69.95 | 51.28 | 84.14 | 32.75 | LlamaForCausalLM |
| bagel-7b-v0.1 | Fine Tuned Models | 72.4 | 65.49 | 63.91 | 83.14 | 64.56 | 52.65 | 80.58 | 48.07 | MistralForCausalLM |
| Synatra-10.7B-v0.4 | Chat Models | 107 | 65.48 | 64.93 | 82.47 | 62.5 | 51.11 | 81.85 | 50.04 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-attention-sparsity-10-v0.1 | Chat Models | 72.4 | 65.48 | 63.05 | 84.88 | 60.84 | 68.11 | 77.11 | 38.89 | MistralForCausalLM |
| Mistral-7B-Instruct-v0.2-sparsity-10 | Chat Models | 72.4 | 65.48 | 62.88 | 84.85 | 60.87 | 67.93 | 77.51 | 38.82 | MistralForCausalLM |
| falcon-180B | Pretrained Models | 1795.2 | 65.46 | 69.2 | 88.89 | 69.59 | 45.16 | 86.74 | 33.21 | FalconForCausalLM |
| Metis-0.3 | Chat Models | 72.4 | 65.44 | 62.71 | 84.8 | 60.92 | 67.56 | 77.27 | 39.35 | MistralForCausalLM |
| KoSoLAR-10.7B-v0.2_1.3_dedup_p | Chat Models | 108 | 65.43 | 63.05 | 83.63 | 64.61 | 52.69 | 80.51 | 48.07 | LlamaForCausalLM |
| lemur-70b-chat-v1 | Chat Models | 700 | 65.38 | 66.98 | 85.73 | 65.99 | 56.58 | 81.69 | 35.33 | LlamaForCausalLM |
| Yi-34B-Chat | Chat Models | 343.9 | 65.32 | 65.44 | 84.16 | 74.9 | 55.37 | 80.11 | 31.92 | LlamaForCausalLM |
| Writing_Partner_Mistral_7B | Fine Tuned Models | 72.4 | 65.29 | 64.59 | 84.59 | 62.55 | 48.55 | 76.87 | 54.59 | MistralForCausalLM |
| tora-70b-v1.0 | Fine Tuned Models | 700 | 65.28 | 67.58 | 85.82 | 69.13 | 51.76 | 82.16 | 35.25 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-sparsity-30-v0.1 | Fine Tuned Models | 72.4 | 65.28 | 63.31 | 84.37 | 60.24 | 66.28 | 78.06 | 39.42 | MistralForCausalLM |
| OpenHermes-Mixtral-8x7B | Fine Tuned Models | 467 | 65.27 | 63.91 | 84.14 | 64.29 | 59.53 | 74.03 | 45.72 | MixtralForCausalLM |
| MathHermes-2.5-Mistral-7B | Chat Models | 72.4 | 65.24 | 64.76 | 84.19 | 63.59 | 51.95 | 77.66 | 49.28 | MistralForCausalLM |
| DPO_mistral_7b_ultra_0129_1k | Fine Tuned Models | 72.4 | 65.2 | 64.16 | 85.54 | 61.04 | 68.34 | 77.19 | 34.95 | MistralForCausalLM |
| MiquMaid-v2-70B | Fine Tuned Models | 689.8 | 65.19 | 70.48 | 87.49 | 75.18 | 57.62 | 84.77 | 15.62 | LlamaForCausalLM |
| Mixtral_7Bx2_MoE_13B | Merged Models or MoE Models | 128.8 | 65.14 | 64.85 | 83.92 | 62.27 | 57.55 | 77.9 | 44.35 | Unknown |
| openhermes-7b-dpo | Fine Tuned Models | 72.4 | 65.14 | 65.78 | 84.94 | 63.66 | 57.01 | 77.51 | 41.93 | MistralForCausalLM |
| internlm-20b-llama | Pretrained Models | 200 | 65.09 | 61.35 | 82.08 | 61.59 | 57.71 | 76.72 | 51.1 | LlamaForCausalLM |
| Fett-uccine-7B | Fine Tuned Models | 72.4 | 65.08 | 63.23 | 86.09 | 60.03 | 69.47 | 75.06 | 36.62 | MistralForCausalLM |
| dolphin-2.2.1-mistral-7b | Fine Tuned Models | 70 | 65.01 | 63.23 | 83.8 | 63.16 | 53.14 | 78.61 | 48.14 | MistralForCausalLM |
| airoboros-l2-70b-gpt4-1.4.1 | Fine Tuned Models | 700 | 64.97 | 70.39 | 87.82 | 70.31 | 55.2 | 83.58 | 22.52 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-Neural-Story | Fine Tuned Models | 72.4 | 64.96 | 64.08 | 83.97 | 60.67 | 66.89 | 75.85 | 38.29 | MistralForCausalLM |
| dolphin-2.2.1-mistral-7b | Fine Tuned Models | 70 | 64.93 | 63.31 | 83.76 | 63.17 | 53.11 | 78.14 | 48.07 | Unknown |
| OpenHermes-2.5-Mistral-7B-MISALIGNED | Chat Models | 72.4 | 64.92 | 65.36 | 84.67 | 63.74 | 52.85 | 77.66 | 45.26 | Unknown |
| dolphin-2.6-mistral-7b | Fine Tuned Models | 72.4 | 64.92 | 63.05 | 84.05 | 63.2 | 55.67 | 77.66 | 45.87 | MistralForCausalLM |
| Nous-Puffin-70B | Fine Tuned Models | 700 | 64.91 | 67.41 | 87.37 | 69.77 | 46.77 | 83.9 | 34.27 | LlamaForCausalLM |
| dolphin-2.6-mistral-7b | Fine Tuned Models | 72.4 | 64.91 | 62.88 | 84.06 | 63.19 | 55.65 | 77.58 | 46.1 | MistralForCausalLM |
| Panda-7B-v0.1 | Fine Tuned Models | 72.4 | 64.89 | 62.97 | 83.76 | 60.73 | 66.97 | 76.24 | 38.67 | MistralForCausalLM |
| llama2_70b_chat_uncensored | Fine Tuned Models | 700 | 64.88 | 68.43 | 86.77 | 68.76 | 52.5 | 82.56 | 30.25 | LlamaForCausalLM |
| bagel-7b-v0.4 | Chat Models | 72.4 | 64.82 | 63.57 | 82.67 | 62.25 | 54.2 | 78.93 | 47.31 | MistralForCausalLM |
| KoSOLAR-10.7B-v0.3 | Pretrained Models | 108 | 64.76 | 62.8 | 83.73 | 64.51 | 44.57 | 82.48 | 50.49 | LlamaForCausalLM |
| Tanuki-7B-v0.1 | Fine Tuned Models | 72.4 | 64.74 | 62.8 | 83.14 | 60.54 | 66.33 | 75.85 | 39.8 | MistralForCausalLM |
| openbuddy-mixtral-7bx8-v17.1-32k | Chat Models | 467.4 | 64.73 | 65.53 | 75.95 | 70.02 | 42.14 | 75.69 | 59.06 | MixtralForCausalLM |
| OpenMia-Indo-Mistral-7b-v4 | Fine Tuned Models | 72.4 | 64.73 | 64.16 | 82.84 | 61.08 | 53.36 | 79.08 | 47.84 | MistralForCausalLM |
| HelpSteer-filtered-Solar-Instruct | Fine Tuned Models | 107.3 | 64.73 | 63.14 | 83.05 | 64.32 | 46.23 | 80.58 | 51.02 | LlamaForCausalLM |
| Etheria-55b-v0.1 | Merged Models or MoE Models | 555.9 | 64.69 | 65.1 | 81.93 | 73.66 | 56.16 | 76.09 | 35.18 | LlamaForCausalLM |
| Euryale-L2-70B | Fine Tuned Models | 700 | 64.66 | 68.94 | 87.07 | 68.84 | 54.49 | 82.08 | 26.54 | LlamaForCausalLM |
| Gecko-7B-v0.1 | Fine Tuned Models | 72.4 | 64.58 | 61.35 | 83.36 | 61.05 | 62.6 | 77.58 | 41.55 | MistralForCausalLM |
| PiVoT-0.1-early | Chat Models | 72.4 | 64.58 | 62.46 | 82.97 | 61.02 | 62.89 | 73.72 | 44.43 | MistralForCausalLM |
| uniwiz-7B-v0.2 | Fine Tuned Models | 72.4 | 64.56 | 63.31 | 85.07 | 63.7 | 59.91 | 77.82 | 37.53 | MistralForCausalLM |
| airoboros-l2-70b-gpt4-m2.0 | Fine Tuned Models | 700 | 64.56 | 70.05 | 87.83 | 70.67 | 49.79 | 83.58 | 25.4 | LlamaForCausalLM |
| openbuddy-mixtral-8x7b-v15.4 | Fine Tuned Models | 467.4 | 64.54 | 66.47 | 71.81 | 70.01 | 55.46 | 71.67 | 51.86 | MixtralForCausalLM |
| Llama-2-70B-fp16 | Fine Tuned Models | 689.8 | 64.52 | 67.32 | 87.33 | 69.83 | 44.92 | 83.74 | 33.97 | LlamaForCausalLM |
| saiga-7b | Fine Tuned Models | 72.4 | 64.51 | 63.14 | 83.14 | 61.66 | 54.99 | 79.01 | 45.11 | MistralForCausalLM |
| openchat-3.5-1210-32k | Fine Tuned Models | 72.4 | 64.49 | 64.68 | 84.06 | 61.59 | 49.31 | 79.16 | 48.14 | MistralForCausalLM |
| llama2-70b-oasst-sft-v10 | Fine Tuned Models | 700 | 64.47 | 67.06 | 86.38 | 67.7 | 56.45 | 82.0 | 27.22 | LlamaForCausalLM |
| DPO_mistral_7b_ultra_0124_v1 | Fine Tuned Models | 72.4 | 64.45 | 66.13 | 86.39 | 59.78 | 69.45 | 79.48 | 25.47 | MistralForCausalLM |
| Starling-LM-11B-alpha-v1 | Fine Tuned Models | 107.3 | 64.44 | 62.2 | 83.24 | 64.03 | 45.7 | 80.51 | 50.95 | Unknown |
| OpenHermes-7B-Symbolic | Fine Tuned Models | 72.4 | 64.44 | 63.14 | 82.73 | 62.62 | 48.82 | 75.85 | 53.45 | MistralForCausalLM |
| OpenHermes-7B-Reasoner | Fine Tuned Models | 72.4 | 64.44 | 63.14 | 82.73 | 62.62 | 48.82 | 75.85 | 53.45 | Unknown |
| medilora-mistral-7b | Fine Tuned Models | 72.4 | 64.41 | 61.69 | 83.13 | 62.22 | 49.91 | 77.66 | 51.86 | Unknown |
| chronos-70b-v2 | Fine Tuned Models | 700 | 64.41 | 68.09 | 86.5 | 68.28 | 53.7 | 81.22 | 28.66 | LlamaForCausalLM |
| Mistral-dpo-v1 | Fine Tuned Models | 0 | 64.39 | 63.48 | 83.59 | 63.35 | 50.49 | 79.32 | 46.1 | Unknown |
| v1olet_merged_dpo_7B_v4 | Chat Models | 70 | 64.3 | 66.98 | 84.09 | 59.02 | 59.43 | 81.06 | 35.25 | Unknown |
| PiVoT-10.7B-Mistral-v0.2 | Chat Models | 107.3 | 64.25 | 63.31 | 81.68 | 59.86 | 58.23 | 80.03 | 42.38 | MistralForCausalLM |
| model_420_preview | Fine Tuned Models | 687.2 | 64.22 | 67.06 | 87.26 | 69.85 | 44.57 | 83.35 | 33.21 | Unknown |
| KoSOLAR-10.7B-v0.2 | Pretrained Models | 107 | 64.2 | 61.35 | 82.63 | 64.85 | 47.94 | 80.74 | 47.69 | LlamaForCausalLM |
| airoboros-l2-70b-gpt4-2.0 | Fine Tuned Models | 700 | 64.14 | 68.52 | 87.89 | 70.41 | 49.79 | 83.5 | 24.72 | LlamaForCausalLM |
| Einstein-v3-7B | Fine Tuned Models | 72.4 | 64.09 | 62.29 | 83.01 | 63.32 | 51.18 | 79.95 | 44.81 | MistralForCausalLM |
| Mistral-Instruct-7B-v0.2-ChatAlpaca-DPO2 | Chat Models | 70 | 64.05 | 61.86 | 83.71 | 59.19 | 64.08 | 78.45 | 37.0 | Unknown |
| llama-65b-hf | Fine Tuned Models | 650 | 63.99 | 63.31 | 86.09 | 63.84 | 43.43 | 82.48 | 44.81 | LLaMAForCausalLM |
| Mistral-7b-instruct-v0.2-summ-sft-e2m | Fine Tuned Models | 72.4 | 63.86 | 59.47 | 83.34 | 60.53 | 63.78 | 76.48 | 39.58 | MistralForCausalLM |
| Hermes-Instruct-7B-v0.2 | Fine Tuned Models | 70 | 63.82 | 60.92 | 82.96 | 60.05 | 61.01 | 76.87 | 41.09 | MistralForCausalLM |
| medilora-qwen-14b | Fine Tuned Models | 141.7 | 63.81 | 56.66 | 79.08 | 65.86 | 47.75 | 74.9 | 58.61 | Unknown |
| CausalLM-Platypus-14B | Chat Models | 141.7 | 63.8 | 56.91 | 80.06 | 64.98 | 47.57 | 76.01 | 57.24 | LlamaForCausalLM |
| mistral-inst-ppo | Fine Tuned Models | 72.4 | 63.79 | 62.37 | 83.2 | 60.86 | 62.3 | 76.95 | 37.07 | Unknown |
| openhermes_dpo_norobot_0201 | Chat Models | 72.4 | 63.78 | 62.03 | 83.4 | 62.4 | 47.44 | 78.22 | 49.2 | MistralForCausalLM |
| tigerbot-70b-base | Pretrained Models | 689.5 | 63.71 | 62.46 | 83.61 | 65.49 | 52.76 | 80.19 | 37.76 | Unknown |
| zephyr-7b-sft-full-SPIN-iter3 | Chat Models | 72.4 | 63.7 | 66.13 | 85.85 | 61.51 | 57.89 | 76.64 | 34.19 | MistralForCausalLM |
| Starling-LM-11B-alpha | Fine Tuned Models | 113.9 | 63.66 | 62.97 | 84.85 | 63.83 | 54.52 | 77.82 | 37.98 | MistralForCausalLM |
| gpt4-alpaca-lora_mlp-65B-HF | Fine Tuned Models | 650 | 63.66 | 65.02 | 86.13 | 62.73 | 59.16 | 80.66 | 28.28 | LlamaForCausalLM |
| Mistral-Instruct-Ukrainian-SFT-DPO | Fine Tuned Models | 72.4 | 63.64 | 60.49 | 83.84 | 60.9 | 57.91 | 76.95 | 41.77 | MistralForCausalLM |
| openinstruct-mistral-7b | Chat Models | 72.4 | 63.64 | 59.73 | 82.77 | 60.55 | 48.76 | 79.56 | 50.49 | MistralForCausalLM |
| Kaiju-A-57B | Fine Tuned Models | 572.6 | 63.64 | 58.79 | 80.95 | 72.66 | 52.29 | 78.77 | 38.36 | LlamaForCausalLM |
| blossom-v4-mistral-7b | Fine Tuned Models | 70 | 63.61 | 62.03 | 82.9 | 62.48 | 53.84 | 77.27 | 43.14 | MistralForCausalLM |
| speechless-code-mistral-7b-v1.0 | Fine Tuned Models | 70 | 63.6 | 61.18 | 83.77 | 63.4 | 47.9 | 78.37 | 47.01 | MistralForCausalLM |
| higgs-llama-vicuna-ep25-70b | Fine Tuned Models | 700 | 63.6 | 62.29 | 86.07 | 64.25 | 53.75 | 80.66 | 34.57 | LlamaForCausalLM |
| Hercules-2.5-Mistral-7B | Chat Models | 72.4 | 63.59 | 62.03 | 83.79 | 63.49 | 43.44 | 79.72 | 49.05 | MistralForCausalLM |
| test-test | Fine Tuned Models | 72.4 | 63.54 | 66.47 | 85.82 | 61.48 | 57.75 | 76.95 | 32.75 | Unknown |
| Kaori-34B-v1 | Fine Tuned Models | 343.9 | 63.52 | 64.51 | 79.65 | 70.19 | 53.14 | 76.95 | 36.69 | LlamaForCausalLM |
| zephyr-7b-sft-full-SPIN-iter2 | Fine Tuned Models | 72.4 | 63.52 | 66.38 | 85.84 | 61.22 | 57.82 | 76.8 | 33.06 | MistralForCausalLM |
| test-test | Fine Tuned Models | 72.4 | 63.52 | 66.38 | 85.84 | 61.22 | 57.82 | 76.8 | 33.06 | Unknown |
| Falkor-16b | Fine Tuned Models | 142.2 | 63.52 | 65.96 | 82.62 | 63.58 | 62.77 | 77.9 | 28.28 | Unknown |
| SUS-Chat-72B | Fine Tuned Models | 720 | 63.51 | 66.3 | 84.96 | 76.7 | 60.27 | 83.43 | 9.4 | LlamaForCausalLM |
| notus-7b-v1 | Fine Tuned Models | 72.4 | 63.49 | 64.59 | 84.83 | 63.04 | 54.35 | 79.56 | 34.57 | MistralForCausalLM |
| Einstein-v2-7B | Fine Tuned Models | 72.4 | 63.48 | 62.37 | 83.46 | 62.08 | 50.52 | 79.32 | 43.14 | MistralForCausalLM |
| Kaori-34B-v1 | Fine Tuned Models | 343.9 | 63.47 | 64.42 | 79.61 | 70.24 | 53.17 | 76.72 | 36.69 | LlamaForCausalLM |
| Mistral7B_adaptor_v1 | Fine Tuned Models | 70 | 63.42 | 62.97 | 83.81 | 63.56 | 49.77 | 79.16 | 41.24 | Unknown |
| Chupacabra-16B-v2.01 | Fine Tuned Models | 142.2 | 63.42 | 65.36 | 82.92 | 63.27 | 64.53 | 79.08 | 25.32 | Unknown |
| tora-70b-v1.0 | Fine Tuned Models | 700 | 63.39 | 67.75 | 85.83 | 69.22 | 51.79 | 81.93 | 23.81 | LlamaForCausalLM |
| Mini_Synatra_SFT | Chat Models | 0 | 63.39 | 62.46 | 83.44 | 61.2 | 53.67 | 74.66 | 44.88 | MistralForCausalLM |
| 1701221123_Ads_Mistral7B-slimorca_all-Lqv-r4b128 | Fine Tuned Models | 70 | 63.37 | 62.88 | 83.99 | 62.89 | 50.55 | 79.72 | 40.18 | Unknown |
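The Average column is simply the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), which can be verified with a short script. The values below are taken from the Velara-11B-V2 row at the top of the table; the function name is illustrative, not part of any leaderboard API.

```python
def leaderboard_average(scores):
    """Arithmetic mean of the six benchmark scores, rounded to two decimals."""
    return round(sum(scores) / len(scores), 2)

# ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K for Velara-11B-V2
velara_scores = [63.82, 85.85, 63.62, 58.83, 77.82, 43.37]
print(leaderboard_average(velara_scores))  # 65.55, matching the table's Average column
```

The same check works for any row, which is a quick way to catch transcription errors when re-publishing leaderboard data.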