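The Open LLM Leaderboard tracks evaluation results for large language models. It ranks and evaluates LLMs and chatbots by their performance across a set of evaluation tasks.
Data source: HuggingFace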
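The page does not state how the average score is derived. As a minimal sketch, assuming it is the unweighted mean of the six benchmark columns (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), the listed averages can be recomputed and the models re-ranked as below. The three rows are copied from the table; the names `average_score` and `rank` are hypothetical helpers, not part of any leaderboard API.

```python
from statistics import mean

# Benchmark columns, in the order they appear in the table.
BENCHMARKS = ("ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K")

# Three rows copied from the table: (model, listed average, six benchmark scores).
ROWS = [
    ("Platypus-30B", 59.03, (64.59, 84.26, 64.23, 45.35, 81.37, 14.4)),
    ("PiVoT-0.1-Evil-a", 59.16, (59.64, 81.48, 58.94, 39.23, 75.3, 40.41)),
    ("zephyr-alpha-Nebula-v2-7B", 59.01, (58.62, 83.05, 56.68, 58.28, 73.56, 23.88)),
]


def average_score(scores):
    """Assumed formula: unweighted mean of the six benchmark scores."""
    if len(scores) != len(BENCHMARKS):
        raise ValueError(f"expected {len(BENCHMARKS)} scores, got {len(scores)}")
    return mean(scores)


def rank(rows):
    """Sort rows by recomputed average, highest first."""
    return sorted(rows, key=lambda row: average_score(row[2]), reverse=True)


if __name__ == "__main__":
    for model, listed, scores in rank(ROWS):
        print(f"{model}: recomputed {average_score(scores):.2f}, listed {listed}")
```

For these rows the recomputed mean agrees with the listed average to within 0.01, so small differences may come down to rounding or truncation of the published figures.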
Data are for reference only; the official source is authoritative.

| Model | Model type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| PiVoT-0.1-Evil-a | Chat Models | 72.4 | 59.16 | 59.64 | 81.48 | 58.94 | 39.23 | 75.3 | 40.41 | MistralForCausalLM |
| Karen_TheEditor_V2_STRICT_Mistral_7B | Chat Models | 72.4 | 59.13 | 59.56 | 81.79 | 59.56 | 49.36 | 74.35 | 30.17 | MistralForCausalLM |
| Mistral-v0.1-PeanutButter-v0.0.0-7B | Chat Models | 72.4 | 59.09 | 62.2 | 84.1 | 64.14 | 46.94 | 78.69 | 18.5 | Unknown |
| Platypus-30B | Chat Models | 325.3 | 59.03 | 64.59 | 84.26 | 64.23 | 45.35 | 81.37 | 14.4 | LlamaForCausalLM |
| zephyr-alpha-Nebula-v2-7B | Chat Models | 72.4 | 59.01 | 58.62 | 83.05 | 56.68 | 58.28 | 73.56 | 23.88 | MistralForCausalLM |
| Hercules-1.0-Mistral-7B | Chat Models | 72.4 | 58.95 | 57.08 | 81.13 | 58.98 | 49.47 | 77.19 | 29.87 | MistralForCausalLM |
| Mistral-7B-v0.1-Open-Platypus | Chat Models | 70 | 58.92 | 62.37 | 85.08 | 63.79 | 47.33 | 77.66 | 17.29 | MistralForCausalLM |
| mistral_7b_norobots | Chat Models | 70 | 58.85 | 58.96 | 80.57 | 57.66 | 41.91 | 75.61 | 38.36 | Unknown |
| Nebula-v2-7B | Chat Models | 72.4 | 58.82 | 58.7 | 83.06 | 57.61 | 46.72 | 75.14 | 31.69 | Unknown |
| Dolphin-Nebula-7B | Chat Models | 72.4 | 58.69 | 55.2 | 78.57 | 53.44 | 57.97 | 73.88 | 33.06 | MistralForCausalLM |
| Mistral-v0.1-PeanutButter-v0.0.2-7B | Chat Models | 72.4 | 58.66 | 61.77 | 84.11 | 64.38 | 45.92 | 78.37 | 17.44 | Unknown |
| SOLAR-Platypus-10.7B-v1 | Chat Models | 107.3 | 58.62 | 61.69 | 84.23 | 60.37 | 51.58 | 82.79 | 11.07 | LlamaForCausalLM |
| koOpenChat-sft | Chat Models | 0 | 58.61 | 59.81 | 78.73 | 61.32 | 51.24 | 76.4 | 24.18 | MistralForCausalLM |
| Mistral-v0.1-PeanutButter-v0.0.5-SFT-7B-QLoRA | Chat Models | 70 | 58.24 | 60.75 | 84.24 | 63.66 | 44.94 | 78.69 | 17.13 | Unknown |
| mistral-7b-platypus1k | Chat Models | 72.4 | 58.19 | 61.6 | 82.93 | 63.16 | 46.96 | 78.14 | 16.38 | MistralForCausalLM |
| llama-33B-instructed | Chat Models | 330 | 58.18 | 64.59 | 86.17 | 60.5 | 44.12 | 79.32 | 14.4 | LlamaForCausalLM |
| llama-megamerge-dare-13b | Chat Models | 130.2 | 58.15 | 60.58 | 83.0 | 54.91 | 45.76 | 76.16 | 28.51 | Unknown |
| Mistral-7B-openplatypus-1k | Chat Models | 70 | 58.07 | 60.15 | 84.25 | 59.84 | 49.86 | 76.87 | 17.44 | LlamaForCausalLM |
| Luban-Marcoroni-13B | Chat Models | 130.2 | 57.98 | 63.65 | 82.92 | 58.7 | 55.55 | 77.03 | 10.01 | Unknown |
| samantha-mistral-7b | Chat Models | 71.1 | 57.96 | 63.4 | 84.1 | 61.36 | 46.08 | 76.8 | 16.0 | Unknown |
| Luban-Marcoroni-13B-v3 | Chat Models | 130.2 | 57.94 | 63.74 | 82.88 | 58.64 | 55.56 | 76.87 | 9.93 | LlamaForCausalLM |
| Luban-Marcoroni-13B-v2 | Chat Models | 130.2 | 57.92 | 63.48 | 82.89 | 58.72 | 55.56 | 76.95 | 9.93 | LlamaForCausalLM |
| Chat-AYB-Nova-13B | Chat Models | 130.2 | 57.84 | 62.97 | 84.28 | 58.58 | 51.28 | 77.58 | 12.36 | Unknown |
| Michel-13B | Chat Models | 130.2 | 57.56 | 61.26 | 83.21 | 55.05 | 50.43 | 75.22 | 20.17 | LlamaForCausalLM |
| airoboros-33b-gpt4-1.3 | Chat Models | 330 | 57.43 | 63.91 | 85.04 | 58.53 | 45.36 | 78.69 | 13.04 | LlamaForCausalLM |
| chinese-alpaca-2-13b | Chat Models | 130 | 57.41 | 58.7 | 79.76 | 55.12 | 50.22 | 75.61 | 25.02 | LlamaForCausalLM |
| OpenOrca-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 57.31 | 62.37 | 82.99 | 59.38 | 52.2 | 75.77 | 11.14 | Unknown |
| 2x-LoRA-Assemble-Nova-13B | Chat Models | 130.2 | 57.26 | 62.63 | 83.24 | 58.64 | 51.88 | 76.95 | 10.24 | Unknown |
| Giraffe-13b-32k-v3 | Chat Models | 130.2 | 57.24 | 59.04 | 79.59 | 55.01 | 46.68 | 76.95 | 26.16 | LlamaForCausalLM |
| orca_mini_v3_13b | Chat Models | 130 | 57.24 | 63.14 | 82.35 | 56.52 | 51.81 | 76.48 | 13.12 | LlamaForCausalLM |
| airoboros-33b-2.1 | Chat Models | 330 | 57.16 | 63.65 | 84.97 | 57.37 | 52.17 | 78.22 | 6.6 | LlamaForCausalLM |
| YuLan-Chat-2-13b-fp16 | Chat Models | 130 | 57.01 | 59.04 | 80.66 | 56.72 | 52.18 | 79.64 | 13.8 | LlamaForCausalLM |
| AISquare-Instruct-llama2-koen-13b-v0.9.24 | Chat Models | 131.6 | 56.98 | 55.63 | 81.35 | 51.76 | 53.0 | 76.95 | 23.2 | LlamaForCausalLM |
| GenAI-Nova-13B | Chat Models | 130.2 | 56.98 | 62.29 | 83.27 | 59.47 | 51.79 | 77.35 | 7.73 | Unknown |
| LosslessMegaCoder-llama2-13b-mini | Chat Models | 130 | 56.92 | 60.58 | 81.26 | 57.92 | 48.89 | 76.95 | 15.92 | LlamaForCausalLM |
| ghost-7b-v0.9.0 | Chat Models | 72.4 | 56.89 | 53.07 | 77.93 | 55.09 | 47.79 | 73.72 | 33.74 | MistralForCausalLM |
| Orca-Nova-13B | Chat Models | 130.2 | 56.72 | 62.37 | 82.47 | 57.44 | 45.97 | 77.58 | 14.48 | Unknown |
| storytime-13b | Chat Models | 130.2 | 56.64 | 62.03 | 83.96 | 57.48 | 52.5 | 75.53 | 8.34 | LlamaForCausalLM |
| EnsembleV5-Nova-13B | Chat Models | 130.2 | 56.49 | 62.71 | 82.55 | 56.79 | 49.86 | 76.24 | 10.77 | Unknown |
| SOLAR_KO_1.3_deup | Chat Models | 108.5 | 56.47 | 55.97 | 79.97 | 55.88 | 47.55 | 76.87 | 22.59 | LlamaForCausalLM |
| Nova-13B | Chat Models | 130.2 | 56.44 | 62.71 | 82.57 | 57.98 | 51.34 | 77.27 | 6.75 | Unknown |
| airoboros-l2-13b-2.2.1 | Chat Models | 130 | 56.36 | 60.92 | 83.77 | 56.47 | 49.42 | 76.01 | 11.6 | LlamaForCausalLM |
| mistral-7b_open_platypus | Chat Models | 70 | 56.29 | 55.8 | 82.13 | 59.76 | 48.87 | 78.61 | 12.59 | MistralForCausalLM |
| Synatra-11B-Testbench | Chat Models | 110 | 56.17 | 57.34 | 78.66 | 55.56 | 51.97 | 75.77 | 17.74 | Unknown |
| Orca-2-13b-SFT-v6 | Chat Models | 130.2 | 56.15 | 60.41 | 80.46 | 59.51 | 54.01 | 77.43 | 5.08 | LlamaForCausalLM |
| SpeechlessV1-Nova-13B | Chat Models | 130.2 | 56.14 | 61.77 | 82.68 | 57.75 | 51.44 | 77.43 | 5.76 | Unknown |
| Nebula-7B | Chat Models | 72.4 | 56.1 | 59.3 | 83.46 | 57.0 | 45.56 | 76.4 | 14.86 | Unknown |
| huginnv1.2 | Chat Models | 128.5 | 55.98 | 62.37 | 84.28 | 57.02 | 47.81 | 75.22 | 9.17 | Unknown |
| Walter-SOLAR-11B | Chat Models | 107.3 | 55.95 | 60.41 | 84.86 | 64.99 | 44.88 | 79.56 | 0.99 | LlamaForCausalLM |
| Chat-AYB-Platypus2-13B | Chat Models | 130.2 | 55.93 | 60.49 | 84.03 | 57.83 | 54.52 | 75.77 | 2.96 | Unknown |
| Synatra-V0.1-7B-Instruct | Chat Models | 70 | 55.86 | 55.29 | 76.63 | 55.29 | 55.76 | 72.77 | 19.41 | MistralForCausalLM |
| Stable-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 55.56 | 62.29 | 82.46 | 57.09 | 51.41 | 76.56 | 3.56 | Unknown |
| minotaur-llama2-13b-qlora | Chat Models | 130 | 55.37 | 60.07 | 82.42 | 55.87 | 45.57 | 76.24 | 12.05 | Unknown |
| Luban-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 55.34 | 60.24 | 82.22 | 58.03 | 55.26 | 75.37 | 0.91 | Unknown |
| 2x-LoRA-Assemble-Platypus2-13B | Chat Models | 130.2 | 55.33 | 60.58 | 82.56 | 58.25 | 54.77 | 74.9 | 0.91 | Unknown |
| SOLAR-Platypus-10.7B-v2 | Chat Models | 107.3 | 55.25 | 59.39 | 83.57 | 59.93 | 43.15 | 81.45 | 4.02 | LlamaForCausalLM |
| OrcaMini-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 55.22 | 60.84 | 82.56 | 56.42 | 53.32 | 75.93 | 2.27 | Unknown |
| zephyr_7b_norobots | Chat Models | 70 | 55.16 | 56.48 | 79.64 | 55.52 | 44.6 | 74.11 | 20.62 | Unknown |
| airoboros-c34b-2.2.1 | Chat Models | 340 | 55.15 | 54.69 | 76.84 | 55.43 | 51.36 | 72.53 | 20.02 | LlamaForCausalLM |
| Llama-2-13B-Instruct-v0.2 | Chat Models | 130 | 55.14 | 60.58 | 81.96 | 55.46 | 45.71 | 77.82 | 9.33 | ? |
| Mistral-7B-Instruct-v0.1 | Chat Models | 72.4 | 54.96 | 54.52 | 75.63 | 55.38 | 56.28 | 73.72 | 14.25 | MistralForCausalLM |
| Sydney_Overthinker_13b_HF | Chat Models | 130.2 | 54.94 | 58.96 | 80.85 | 51.28 | 45.7 | 73.95 | 18.88 | LlamaForCausalLM |
| Mistral-7B-AEZAKMI-v1 | Chat Models | 72.4 | 54.92 | 58.87 | 82.01 | 58.72 | 53.54 | 75.69 | 0.68 | MistralForCausalLM |
| OpenOrcaPlatypus2-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 54.86 | 59.81 | 82.69 | 56.96 | 52.92 | 74.43 | 2.35 | Unknown |
| Ensemble5-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 54.76 | 59.73 | 82.66 | 56.94 | 52.92 | 74.43 | 1.9 | Unknown |
| MythoMix-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 54.74 | 60.32 | 83.72 | 55.74 | 52.18 | 75.53 | 0.91 | Unknown |
| llama-2-13B-instructed | Chat Models | 130 | 54.63 | 59.39 | 83.88 | 55.57 | 46.89 | 74.03 | 8.04 | LlamaForCausalLM |
| Llama-2-13b-hf-ds_eli5_1024_r_64_alpha_16 | Chat Models | 130 | 54.61 | 60.41 | 82.58 | 55.86 | 43.61 | 76.72 | 8.49 | Unknown |
| Nous-Hermes-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 54.6 | 59.9 | 83.29 | 56.69 | 51.08 | 75.22 | 1.44 | Unknown |
| Samantha-Nebula-7B | Chat Models | 72.4 | 54.58 | 57.0 | 82.25 | 54.21 | 49.58 | 73.09 | 11.37 | MistralForCausalLM |
| Camelidae-8x7B | Chat Models | 70 | 54.47 | 55.63 | 79.18 | 50.1 | 42.86 | 76.24 | 22.82 | LlamaForCausalLM |
| Limarp-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 54.46 | 60.49 | 82.76 | 56.52 | 44.14 | 76.8 | 6.07 | Unknown |
| EverythingLM-13b-V3-peft | Chat Models | 128.5 | 54.24 | 58.36 | 81.03 | 54.7 | 52.98 | 72.85 | 5.53 | Unknown |
| llama-2-13b-hf-platypus | Chat Models | 130.2 | 54.22 | 58.87 | 82.14 | 54.98 | 42.84 | 77.11 | 9.4 | LlamaForCausalLM |
| Alpagasus-2-13b-QLoRA-merged | Chat Models | 130 | 54.2 | 60.84 | 82.43 | 55.55 | 38.65 | 76.87 | 10.84 | LlamaForCausalLM |
| PuddleJumper-13b-V2 | Chat Models | 130 | 54.19 | 57.0 | 81.06 | 58.3 | 52.66 | 72.45 | 3.64 | LlamaForCausalLM |
| Llama-2-13b-hf-eli5-wiki-1024_r_64_alpha_16 | Chat Models | 130 | 54.14 | 59.98 | 82.43 | 55.41 | 39.9 | 76.56 | 10.54 | Unknown |
| chinese-alpaca-2-13b-16k | Chat Models | 130 | 54.12 | 55.03 | 77.41 | 51.28 | 46.5 | 73.4 | 21.08 | LlamaForCausalLM |
| mnsim-dpo-peftmerged-2-eos | Chat Models | 131.6 | 54.04 | 55.63 | 77.82 | 51.25 | 46.37 | 76.24 | 16.91 | LlamaForCausalLM |
| MythicalDestroyerV2-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 54.01 | 57.34 | 81.24 | 55.64 | 55.98 | 73.88 | 0.0 | Unknown |
| WizardMath-13B-V1.0 | Chat Models | 130 | 53.97 | 60.07 | 82.01 | 54.8 | 42.7 | 71.9 | 12.36 | LlamaForCausalLM |
| Platypus-Nebula-v2-7B | Chat Models | 72.4 | 53.95 | 55.38 | 83.02 | 56.07 | 46.94 | 72.22 | 10.08 | MistralForCausalLM |
| Ferret-7B | Chat Models | 70 | 53.93 | 62.29 | 81.31 | 60.27 | 40.01 | 77.66 | 2.05 | Unknown |
| llama-2-13b-chat-platypus | Chat Models | 130.2 | 53.92 | 53.84 | 80.67 | 54.44 | 46.23 | 76.01 | 12.36 | LlamaForCausalLM |
| Ferret_7B | Chat Models | 70 | 53.87 | 62.29 | 81.33 | 60.09 | 39.94 | 77.51 | 2.05 | MistralForCausalLM |
| Ferret-7B | Chat Models | 70 | 53.87 | 62.29 | 81.33 | 60.09 | 39.94 | 77.51 | 2.05 | Unknown |
| Libra-19B | Chat Models | 190 | 53.83 | 60.58 | 82.04 | 55.57 | 48.41 | 76.32 | 0.08 | LlamaForCausalLM |
| GiftedConvo13bLoraNoEconsE4 | Chat Models | 130 | 53.74 | 59.9 | 84.11 | 54.67 | 41.94 | 74.03 | 7.81 | Unknown |
| mistral-7b-sft-open-orca-flan-50k | Chat Models | 72.4 | 53.7 | 58.79 | 81.92 | 55.72 | 37.49 | 77.98 | 10.31 | MistralForCausalLM |
| Llama-2-13b-chat-dutch | Chat Models | 130.2 | 53.69 | 59.3 | 81.45 | 55.82 | 38.23 | 76.64 | 10.69 | LlamaForCausalLM |
| platypus2-22b-relora | Chat Models | 218.3 | 53.64 | 57.51 | 82.36 | 54.94 | 43.62 | 77.11 | 6.29 | Unknown |
| deacon-13b | Chat Models | 130.2 | 53.63 | 57.85 | 82.63 | 55.25 | 39.33 | 76.32 | 10.39 | LlamaForCausalLM |
| tora-13b-v1.0 | Chat Models | 130 | 53.62 | 58.96 | 82.31 | 54.73 | 40.25 | 75.61 | 9.86 | LlamaForCausalLM |
| MistralInstructLongish | Chat Models | 72.4 | 53.62 | 60.75 | 81.86 | 60.49 | 40.55 | 76.56 | 1.52 | MistralForCausalLM |
| WizardLM-1.0-Uncensored-CodeLlama-34b | Chat Models | 334.8 | 53.59 | 56.4 | 75.45 | 54.51 | 43.06 | 72.45 | 19.64 | Unknown |
| code-millenials-34b | Chat Models | 337.4 | 53.51 | 49.83 | 75.09 | 49.28 | 45.37 | 69.06 | 32.45 | LlamaForCausalLM |
| orca_mini_v3_7b | Chat Models | 70 | 53.47 | 56.91 | 79.64 | 52.37 | 50.51 | 74.27 | 7.13 | LlamaForCausalLM |
| samantha-mistral-instruct-7b | Chat Models | 71.1 | 53.4 | 53.5 | 75.14 | 51.72 | 58.81 | 70.4 | 10.84 | Unknown |
| GiftedConvo13bLoraNoEcons | Chat Models | 130 | 53.35 | 59.39 | 83.19 | 55.15 | 40.56 | 74.03 | 7.81 | Unknown |