The Open LLM Leaderboard tracks evaluation results for large language models, ranking LLMs and chatbots by their performance across a set of benchmark tasks.

Data source: HuggingFace. The figures are for reference only; consult the official source for authoritative numbers. The link next to each model name leads to its DataLearner model detail page.

| Model Name | Model Type | Parameters (100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| supermario-slerp-v2 | Fine Tuned Models | 72.4 | 71.35 | 69.37 | 86.6 | 64.91 | 62.96 | 80.82 | 63.46 | Unknown |
| Mixtral-8x7b-DPO-v0.2 | Chat Models | 467 | 71.32 | 70.39 | 87.73 | 71.03 | 58.69 | 82.56 | 57.54 | MixtralForCausalLM |
| ipo-test | Chat Models | 0 | 71.29 | 67.92 | 85.99 | 65.05 | 55.87 | 80.9 | 72.02 | Unknown |
| sheep-duck-llama-2-70b-v1.1 | Chat Models | 700 | 71.22 | 73.12 | 87.77 | 70.77 | 64.55 | 83.11 | 47.99 | LlamaForCausalLM |
| caigun-lora-model-34B-v2 | Fine Tuned Models | 343.9 | 71.19 | 65.02 | 85.28 | 75.69 | 58.03 | 83.03 | 60.12 | LlamaForCausalLM |
| neural-chat-v3-3-8x7b-MoE | Fine Tuned Models | 467 | 71.17 | 66.64 | 85.43 | 62.22 | 63.2 | 79.72 | 69.83 | MixtralForCausalLM |
| PlatYi-34B-Llama-Q | Chat Models | 343.9 | 71.13 | 65.7 | 85.22 | 78.78 | 53.64 | 83.03 | 60.42 | LlamaForCausalLM |
| FusionNet_SOLAR | Fine Tuned Models | 159.7 | 71.08 | 71.59 | 88.4 | 65.29 | 69.21 | 81.06 | 50.95 | LlamaForCausalLM |
| Nous-Hermes-2-SOLAR-10.7B-x2-MoE | Fine Tuned Models | 191.9 | 71.08 | 67.15 | 84.83 | 66.52 | 55.85 | 83.11 | 68.99 | MixtralForCausalLM |
| MetaMath-bagel-34b-v0.2-c1500 | Fine Tuned Models | 343.9 | 71.06 | 63.91 | 82.43 | 74.51 | 53.7 | 80.98 | 70.81 | LlamaForCausalLM |
| Yi-34B-200K-AEZAKMI-RAW-1701 | Fine Tuned Models | 340 | 71.04 | 66.81 | 85.79 | 75.44 | 57.91 | 80.35 | 59.97 | LlamaForCausalLM |
| shqiponja-15b-v1 | Fine Tuned Models | 150 | 71.03 | 66.38 | 85.26 | 64.62 | 56.81 | 84.06 | 69.07 | MixtralForCausalLM |
| Metabird-7B | Fine Tuned Models | 72.4 | 71.03 | 69.54 | 87.54 | 65.27 | 57.94 | 83.03 | 62.85 | MistralForCausalLM |
| Yi-34B-200K-AEZAKMI-v2 | Chat Models | 343.9 | 71.0 | 67.92 | 85.61 | 75.22 | 56.74 | 81.61 | 58.91 | LlamaForCausalLM |
| Nous-Hermes-2-SOLAR-10.7B | Fine Tuned Models | 107.3 | 71.0 | 66.72 | 84.89 | 66.3 | 55.82 | 82.79 | 69.45 | LlamaForCausalLM |
| yi-34b-200k-rawrr-dpo-1 | Chat Models | 343.9 | 70.97 | 65.44 | 85.69 | 76.09 | 54.0 | 82.79 | 61.79 | LlamaForCausalLM |
| Yi-34B-Llama | Pretrained Models | 343.9 | 70.95 | 64.59 | 85.63 | 76.31 | 55.6 | 82.79 | 60.8 | LlamaForCausalLM |
| openbuddy-mixtral-7bx8-v18.1-32k | Chat Models | 467.4 | 70.95 | 67.66 | 84.3 | 70.94 | 56.72 | 80.98 | 65.13 | MixtralForCausalLM |
| Marcoroni-7B-v2 | Fine Tuned Models | 70 | 70.92 | 68.26 | 86.27 | 63.39 | 61.96 | 80.11 | 65.5 | Unknown |
| Draco-8x7B | Fine Tuned Models | 467 | 70.89 | 65.02 | 85.24 | 64.96 | 62.65 | 80.66 | 66.79 | MixtralForCausalLM |
| flux-7b-v0.1 | Fine Tuned Models | 72.4 | 70.85 | 67.06 | 86.18 | 65.4 | 55.05 | 79.01 | 72.4 | MistralForCausalLM |
| MiaAffogato-Indo-Mistral-7b | Fine Tuned Models | 70 | 70.83 | 66.38 | 85.43 | 64.11 | 58.18 | 83.19 | 67.7 | MistralForCausalLM |
| Yi-34B-200K | Pretrained Models | 343.9 | 70.81 | 65.36 | 85.58 | 76.06 | 53.64 | 82.56 | 61.64 | LlamaForCausalLM |
| stealth-v1.2 | Fine Tuned Models | 72.4 | 70.68 | 66.38 | 86.14 | 64.33 | 54.23 | 80.74 | 72.25 | MistralForCausalLM |
| internlm2-20b-llama | Pretrained Models | 198.6 | 70.66 | 64.59 | 83.12 | 67.27 | 54.13 | 84.21 | 70.66 | LlamaForCausalLM |
| internlm2-20b-llama | Pretrained Models | 198.6 | 70.61 | 64.68 | 83.16 | 67.17 | 54.17 | 84.29 | 70.2 | LlamaForCausalLM |
| strix-rufipes-70b | Fine Tuned Models | 689.8 | 70.61 | 71.33 | 87.86 | 69.13 | 56.72 | 84.77 | 53.83 | LlamaForCausalLM |
| dolphin-2.2-70b | Chat Models | 700 | 70.6 | 70.05 | 85.97 | 69.18 | 60.14 | 81.45 | 56.79 | Unknown |
| ChatAllInOne-Yi-34B-200K-V1 | Fine Tuned Models | 343.9 | 70.56 | 65.96 | 84.53 | 74.13 | 56.96 | 82.72 | 59.06 | LlamaForCausalLM |
| ChatAllInOne-Yi-34B-200K-V1 | Fine Tuned Models | 343.9 | 70.55 | 65.96 | 84.58 | 73.95 | 56.82 | 82.48 | 59.51 | LlamaForCausalLM |
| kaori-70b-v1 | Fine Tuned Models | 700 | 70.54 | 69.8 | 87.36 | 70.82 | 58.81 | 84.06 | 52.39 | LlamaForCausalLM |
| Pallas-0.2 | Chat Models | 343.9 | 70.51 | 64.59 | 83.44 | 75.53 | 55.29 | 81.61 | 62.62 | LlamaForCausalLM |
| Pallas-0.2 | Chat Models | 343.9 | 70.49 | 64.51 | 83.47 | 75.64 | 55.27 | 81.37 | 62.7 | LlamaForCausalLM |
| Mixtral-8x7b-DPO-v0.1 | Chat Models | 467 | 70.45 | 70.9 | 87.61 | 70.66 | 57.38 | 82.4 | 53.75 | MixtralForCausalLM |
| Konstanta-Gamma-10.9B | Merged Models or MoE Models | 109.5 | 70.44 | 68.26 | 87.38 | 64.5 | 64.18 | 80.98 | 57.32 | MistralForCausalLM |
| Chupacabra-7B-v2.01 | Fine Tuned Models | 72.4 | 70.43 | 68.86 | 86.12 | 63.9 | 63.5 | 80.51 | 59.67 | MistralForCausalLM |
| Chupacabra-8x7B-MoE | Fine Tuned Models | 467 | 70.4 | 68.77 | 86.11 | 63.86 | 63.5 | 80.51 | 59.67 | MixtralForCausalLM |
| Tulpar-7b-v2 | Fine Tuned Models | 72.4 | 70.36 | 67.49 | 84.89 | 63.02 | 63.65 | 79.48 | 63.61 | MistralForCausalLM |
| OpenAGI-7B-v0.1 | Chat Models | 72.4 | 70.34 | 66.72 | 86.13 | 63.53 | 69.55 | 79.48 | 56.63 | MistralForCausalLM |
| ShiningValiant | Fine Tuned Models | 689.8 | 70.34 | 68.69 | 87.31 | 69.64 | 55.78 | 84.14 | 56.48 | LlamaForCausalLM |
| firefly-mixtral-8x7b-v1 | Fine Tuned Models | 467 | 70.34 | 68.09 | 85.76 | 71.49 | 55.31 | 82.08 | 59.29 | Unknown |
| firefly-mixtral-8x7b-v0.1 | Fine Tuned Models | 467 | 70.34 | 68.09 | 85.76 | 71.49 | 55.31 | 82.08 | 59.29 | Unknown |
| una-neural-chat-v3-3-P1-OMA | Fine Tuned Models | 0 | 70.32 | 66.81 | 85.92 | 63.37 | 64.35 | 79.64 | 61.87 | MistralForCausalLM |
| CapybaraMarcoroni-7B | Fine Tuned Models | 72.4 | 70.32 | 65.02 | 84.81 | 65.2 | 57.07 | 81.14 | 68.69 | MistralForCausalLM |
| SauerkrautLM-7b-LaserChat | Chat Models | 72.4 | 70.32 | 67.58 | 83.58 | 64.93 | 56.08 | 80.9 | 68.84 | MistralForCausalLM |
| Tess-34B-v1.5b | Fine Tuned Models | 340 | 70.31 | 63.91 | 84.43 | 76.26 | 53.12 | 81.29 | 62.85 | LlamaForCausalLM |
| flux-7b-v0.2 | Fine Tuned Models | 72.4 | 70.3 | 66.55 | 86.12 | 65.38 | 51.8 | 79.32 | 72.63 | MistralForCausalLM |
| caigun-lora-model-34B-v3 | Fine Tuned Models | 343.9 | 70.27 | 66.89 | 84.77 | 75.41 | 56.47 | 83.58 | 54.51 | LlamaForCausalLM |
| SynthIA-70B-v1.5 | Fine Tuned Models | 700 | 70.23 | 69.37 | 86.97 | 69.16 | 57.4 | 83.66 | 54.81 | LlamaForCausalLM |
| Pallas-0.5-LASER-0.1 | Fine Tuned Models | 0 | 70.23 | 64.68 | 83.49 | 74.94 | 56.78 | 81.29 | 60.2 | LlamaForCausalLM |
| Pallas-0.5 | Chat Models | 343.9 | 70.22 | 64.76 | 83.46 | 75.01 | 56.88 | 81.29 | 59.89 | LlamaForCausalLM |
| MetaMath-Chupacabra-7B-v2.01-Slerp | Fine Tuned Models | 72.4 | 70.21 | 66.13 | 85.46 | 63.92 | 56.15 | 79.48 | 70.13 | Unknown |
| MetaMath-Tulpar-7b-v2-Slerp | Fine Tuned Models | 72.4 | 70.2 | 65.61 | 85.16 | 63.49 | 56.5 | 79.48 | 70.96 | Unknown |
| OpenHermes-2.5-neural-chat-v3-2-Slerp | Fine Tuned Models | 72.4 | 70.2 | 67.49 | 85.42 | 64.13 | 61.05 | 80.03 | 63.08 | Unknown |
| chinese-mixtral-instruct | Fine Tuned Models | 467 | 70.19 | 67.75 | 85.67 | 71.53 | 57.46 | 83.11 | 55.65 | MixtralForCausalLM |
| Yi-34B-200K-AEZAKMI-RAW-2301 | Chat Models | 343.9 | 70.12 | 66.04 | 84.7 | 74.89 | 56.89 | 81.14 | 57.09 | LlamaForCausalLM |
| MetaMath-OpenHermes-2.5-neural-chat-v3-3-Slerp | Fine Tuned Models | 72.4 | 70.11 | 64.59 | 85.39 | 64.27 | 55.14 | 79.64 | 71.65 | Unknown |
| Tess-34B-v1.4 | Fine Tuned Models | 343.9 | 70.11 | 64.59 | 83.37 | 75.02 | 56.79 | 81.22 | 59.67 | LlamaForCausalLM |
| Pallas-0.4 | Chat Models | 343.9 | 70.08 | 63.65 | 83.3 | 74.93 | 57.26 | 80.43 | 60.88 | LlamaForCausalLM |
| Pallas-0.3 | Chat Models | 343.9 | 70.06 | 63.74 | 83.3 | 75.08 | 57.31 | 80.66 | 60.27 | LlamaForCausalLM |
| FashionGPT-70B-V1.1 | Fine Tuned Models | 700 | 70.05 | 71.76 | 88.2 | 70.99 | 65.26 | 82.64 | 41.47 | LlamaForCausalLM |
| Pallas-0.4 | Chat Models | 343.9 | 70.04 | 63.65 | 83.3 | 75.11 | 57.29 | 80.58 | 60.27 | LlamaForCausalLM |
| OpenMia-Indo-Engineering | Fine Tuned Models | 72.4 | 70.03 | 67.15 | 85.01 | 62.86 | 57.94 | 82.32 | 64.9 | MistralForCausalLM |
| Pallas-0.5-LASER-0.2 | Fine Tuned Models | 343.9 | 70.01 | 64.68 | 83.49 | 74.84 | 56.76 | 81.37 | 58.91 | LlamaForCausalLM |
| RolePlayLake-7B-Toxic | Fine Tuned Models | 70 | 70.0 | 66.98 | 84.86 | 63.79 | 56.54 | 82.24 | 65.58 | MistralForCausalLM |
| Solstice-11B-v1 | Fine Tuned Models | 110 | 69.97 | 70.56 | 87.39 | 65.98 | 61.98 | 83.11 | 50.8 | LlamaForCausalLM |
| openchat-nectar-0.1 | Fine Tuned Models | 72.4 | 69.94 | 66.21 | 82.99 | 65.17 | 54.22 | 81.37 | 69.67 | MistralForCausalLM |
| Pallas-0.3 | Chat Models | 343.9 | 69.88 | 63.57 | 83.36 | 75.09 | 57.32 | 80.19 | 59.74 | LlamaForCausalLM |
| PlatYi-34B-Q | Chat Models | 343.9 | 69.86 | 66.89 | 85.14 | 77.66 | 53.03 | 82.48 | 53.98 | LlamaForCausalLM |
| neural-chat-7b-v3-3 | Fine Tuned Models | 70 | 69.83 | 66.89 | 85.26 | 63.07 | 63.01 | 79.64 | 61.11 | MistralForCausalLM |
| Chupacabra-7B-v2.02 | Fine Tuned Models | 72.4 | 69.82 | 67.66 | 83.9 | 61.98 | 64.06 | 79.4 | 61.94 | MistralForCausalLM |
| SOLAR-10.7B-Instruct-DPO-v1.0 | Fine Tuned Models | 7 | 69.81 | 73.12 | 89.77 | 64.21 | 73.27 | 81.93 | 36.54 | Unknown |
| Tess-M-v1.1 | Fine Tuned Models | 343.9 | 69.79 | 67.15 | 84.76 | 74.5 | 54.8 | 82.87 | 54.66 | LlamaForCausalLM |
| internlm2-20b | Pretrained Models | 200 | 69.75 | 62.97 | 83.21 | 67.58 | 51.27 | 85.56 | 67.93 | Unknown |
| airoboros-l2-70b-3.1.2 | Fine Tuned Models | 689.8 | 69.74 | 70.14 | 86.88 | 69.72 | 59.19 | 83.11 | 49.43 | LlamaForCausalLM |
| Tess-M-v1.3 | Fine Tuned Models | 0 | 69.71 | 62.54 | 83.95 | 75.36 | 56.03 | 81.14 | 59.21 | LlamaForCausalLM |
| bagel-34b-v0.2 | Fine Tuned Models | 343.9 | 69.7 | 68.77 | 83.72 | 76.45 | 59.26 | 83.82 | 46.17 | LlamaForCausalLM |
| Rabbit-7B-DPO-Chat | Chat Models | 70 | 69.69 | 70.31 | 87.43 | 60.5 | 62.18 | 79.16 | 58.53 | MistralForCausalLM |
| openchat-nectar-0.5 | Fine Tuned Models | 72.4 | 69.67 | 66.72 | 83.53 | 65.36 | 52.15 | 82.08 | 68.16 | MistralForCausalLM |
| una-cybertron-7b-v2-bf16 | Fine Tuned Models | 72.4 | 69.67 | 68.26 | 85.85 | 63.23 | 64.63 | 80.98 | 55.04 | MistralForCausalLM |
| CCK_gony | Fine Tuned Models | 467 | 69.61 | 69.11 | 86.78 | 69.43 | 56.74 | 81.53 | 54.06 | MixtralForCausalLM |
| Yi-34B-200K-AEZAKMI-RAW-2901 | Chat Models | 343.9 | 69.59 | 64.93 | 84.98 | 73.7 | 55.09 | 79.32 | 59.51 | LlamaForCausalLM |
| Pandora-13B-v1 | Fine Tuned Models | 124.8 | 69.59 | 67.06 | 87.53 | 63.65 | 65.77 | 80.51 | 52.99 | MistralForCausalLM |
| orthorus-125b-moe | Fine Tuned Models | 1253.5 | 69.58 | 67.66 | 85.52 | 68.94 | 56.27 | 82.32 | 56.79 | MixtralForCausalLM |
| DPOpenHermes-7B-v2 | Chat Models | 72.4 | 69.58 | 66.64 | 85.22 | 63.64 | 59.22 | 79.16 | 63.61 | MistralForCausalLM |
| Qwen-72B-Llama | Pretrained Models | 722.9 | 69.53 | 64.85 | 83.27 | 73.66 | 57.6 | 81.53 | 56.25 | LlamaForCausalLM |
| una-cybertron-7b-v1-fp16 | Fine Tuned Models | 72.4 | 69.49 | 68.43 | 85.42 | 63.34 | 63.28 | 81.37 | 55.12 | MistralForCausalLM |
| openchat-3.5-0106-laser | Fine Tuned Models | 72.4 | 69.46 | 66.04 | 83.18 | 65.11 | 52.08 | 81.45 | 68.92 | MistralForCausalLM |
| saulgoodman-2x7b-alpha1 | Fine Tuned Models | 70 | 69.43 | 66.21 | 85.36 | 64.95 | 60.06 | 79.24 | 60.73 | MixtralForCausalLM |
| Yi-34B | Pretrained Models | 343.9 | 69.42 | 64.59 | 85.69 | 76.35 | 56.23 | 83.03 | 50.64 | LlamaForCausalLM |
| yi-34b-200k-rawrr-dpo-2 | Fine Tuned Models | 343.9 | 69.42 | 64.68 | 84.74 | 75.96 | 46.15 | 83.19 | 61.79 | LlamaForCausalLM |
| Bald-Eagle-7B | Fine Tuned Models | 72.4 | 69.39 | 64.51 | 84.79 | 64.39 | 54.65 | 80.98 | 67.02 | MistralForCausalLM |
| saulgoodman-7b-alpha1 | Fine Tuned Models | 72.4 | 69.38 | 65.7 | 85.5 | 65.19 | 61.13 | 79.01 | 59.74 | MistralForCausalLM |
| deepseek-llm-67b-base | Pretrained Models | 670 | 69.38 | 65.44 | 87.1 | 71.78 | 51.08 | 84.14 | 56.71 | LlamaForCausalLM |
| Sensualize-Mixtral-bf16 | Fine Tuned Models | 467 | 69.37 | 70.14 | 86.6 | 70.89 | 54.17 | 82.4 | 52.01 | MixtralForCausalLM |
| Rabbit-7B-v2-DPO-Chat | Chat Models | 72.4 | 69.36 | 66.13 | 85.18 | 62.92 | 67.06 | 79.24 | 55.65 | MistralForCausalLM |
| openbuddy-deepseek-67b-v15-base | Fine Tuned Models | 674.2 | 69.34 | 66.3 | 86.03 | 70.97 | 52.31 | 83.58 | 56.86 | LlamaForCausalLM |
| MetaModel_moe_multilingualv1 | Chat Models | 467 | 69.33 | 67.58 | 84.72 | 63.77 | 61.21 | 77.35 | 61.33 | MixtralForCausalLM |
| openchat-3.5-0106-32k | Fine Tuned Models | 72.4 | 69.3 | 66.04 | 82.93 | 65.04 | 51.9 | 81.77 | 68.16 | MistralForCausalLM |
| Platypus2-70B-instruct | Fine Tuned Models | 689.8 | 69.3 | 71.84 | 87.94 | 70.48 | 62.26 | 82.72 | 40.56 | LlamaForCausalLM |
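The Average column in the table above appears to be the unweighted mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), rounded to two decimals. A minimal sketch checking this against the Yi-34B row:

```python
# Benchmark scores for Yi-34B, copied from the table above.
scores = {
    "ARC": 64.59,
    "HellaSwag": 85.69,
    "MMLU": 76.35,
    "TruthfulQA": 56.23,
    "Winogrande": 83.03,
    "GSM8K": 50.64,
}

# Unweighted mean over the six benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 69.42, matching the Average column for Yi-34B
```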