The Open LLM Leaderboard tracks evaluation results for large language models, ranking LLMs and chatbots by their performance across a set of benchmark tasks.

Data source: HuggingFace. Figures are for reference only; consult the official source for authoritative numbers. Links next to model names lead to the corresponding DataLearner model detail pages.
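The Average column is the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), rounded to two decimals. A minimal Python sketch that verifies this against the GOAT-70B-Storytelling row from the table below:

```python
# Verify that the Average column is the mean of the six benchmark scores.
# Scores taken from the GOAT-70B-Storytelling row in the table below.
scores = {
    "ARC": 68.77,
    "HellaSwag": 87.74,
    "MMLU": 69.92,
    "TruthfulQA": 53.53,
    "Winogrande": 83.50,
    "GSM8K": 40.79,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 67.38, matching the Average column
```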
| Model Name | Model Type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| GOAT-70B-Storytelling | Fine Tuned Models | 700 | 67.38 | 68.77 | 87.74 | 69.92 | 53.53 | 83.5 | 40.79 | LlamaForCausalLM |
| TopicNeuralHermes-2.5-Mistral-7B | Fine Tuned Models | 72.4 | 67.36 | 67.06 | 85.44 | 63.66 | 55.47 | 78.3 | 54.21 | MistralForCausalLM |
| Synthia-v3.0-11B | Fine Tuned Models | 110 | 67.35 | 64.08 | 85.32 | 66.18 | 48.22 | 84.21 | 56.1 | LlamaForCausalLM |
| Fewshot-Metamath-OrcaVicuna-Mistral | Fine Tuned Models | 72.4 | 67.33 | 59.64 | 81.82 | 61.69 | 53.23 | 78.45 | 69.14 | MistralForCausalLM |
| NeuralHermes-2.5-Mistral-7B-laser | Fine Tuned Models | 72.4 | 67.29 | 66.38 | 85.09 | 63.43 | 54.95 | 78.14 | 55.72 | MistralForCausalLM |
| Mistral-CatMacaroni-slerp-uncensored | Fine Tuned Models | 72.4 | 67.28 | 64.25 | 84.09 | 62.66 | 56.87 | 79.72 | 56.1 | Unknown |
| dolphin-2.6-mistral-7b-dpo-laser | Fine Tuned Models | 72.4 | 67.28 | 66.3 | 85.73 | 63.16 | 61.71 | 79.16 | 47.61 | MistralForCausalLM |
| Samantha-1.11-70b | Chat Models | 687.2 | 67.28 | 70.05 | 87.55 | 67.82 | 65.02 | 83.27 | 29.95 | Unknown |
| Ana-v1-m7 | Fine Tuned Models | 72.4 | 67.24 | 67.41 | 85.98 | 64.43 | 55.03 | 78.06 | 52.54 | MistralForCausalLM |
| Pearl-3x7B | Merged Models or MoE Models | 185.2 | 67.23 | 65.53 | 85.54 | 64.27 | 52.17 | 78.69 | 57.16 | MixtralForCausalLM |
| Tess-10.7B-v1.5b | Fine Tuned Models | 107 | 67.21 | 65.36 | 85.33 | 66.24 | 47.38 | 82.79 | 56.18 | LlamaForCausalLM |
| Pallas-0.5-LASER-0.5 | Fine Tuned Models | 343.9 | 67.21 | 63.48 | 82.21 | 74.31 | 54.64 | 79.64 | 48.98 | LlamaForCausalLM |
| dolphin-2.6-mistral-7b-dpo | Fine Tuned Models | 72.4 | 67.2 | 65.61 | 85.48 | 63.24 | 61.47 | 78.61 | 48.75 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO-recovered | Chat Models | 72.4 | 67.16 | 65.27 | 84.62 | 63.82 | 52.91 | 78.06 | 58.3 | MistralForCausalLM |
| Starling-LM-7B-alpha | Fine Tuned Models | 72.4 | 67.13 | 63.82 | 84.9 | 64.67 | 46.39 | 80.58 | 62.4 | MistralForCausalLM |
| bagel-dpo-7b-v0.4 | Chat Models | 72.4 | 67.13 | 67.58 | 84.3 | 61.95 | 63.94 | 78.14 | 46.85 | MistralForCausalLM |
| lzlv_70b_fp16_hf | Fine Tuned Models | 689.8 | 67.13 | 70.14 | 87.54 | 70.23 | 60.49 | 83.43 | 30.93 | LlamaForCausalLM |
| model_007_v2 | Fine Tuned Models | 687.2 | 67.13 | 71.42 | 87.31 | 68.58 | 62.65 | 84.14 | 28.66 | Unknown |
| Starling-LM-alpha-8x7B-MoE | Fine Tuned Models | 467 | 67.11 | 63.65 | 84.9 | 64.68 | 46.39 | 80.58 | 62.47 | MixtralForCausalLM |
| smol-7b | Fine Tuned Models | 72.4 | 67.11 | 63.74 | 84.77 | 65.0 | 46.17 | 80.66 | 62.32 | MistralForCausalLM |
| Dionysus-Mistral-m3-v6 | Fine Tuned Models | 72.4 | 67.1 | 63.14 | 84.51 | 62.82 | 49.49 | 78.45 | 64.22 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO | Chat Models | 72.4 | 67.1 | 65.27 | 84.62 | 63.83 | 52.91 | 78.06 | 57.92 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B-mt-bench-DPO-corrupted | Chat Models | 72.4 | 67.09 | 65.27 | 84.58 | 63.74 | 52.84 | 78.06 | 58.07 | MistralForCausalLM |
| Starling-LM-7B-alpha | Fine Tuned Models | 72.4 | 67.05 | 63.65 | 84.87 | 64.7 | 46.32 | 80.43 | 62.32 | MistralForCausalLM |
| MetaMath-70B-V1.0 | Fine Tuned Models | 700 | 67.02 | 68.0 | 86.85 | 69.31 | 50.98 | 82.32 | 44.66 | LlamaForCausalLM |
| Synthia-70B-v1.2b | Fine Tuned Models | 700 | 67.0 | 68.77 | 87.57 | 68.81 | 57.69 | 83.9 | 35.25 | LlamaForCausalLM |
| Mixtral-SlimOrca-8x7B | Fine Tuned Models | 467 | 66.97 | 67.66 | 85.11 | 67.98 | 54.98 | 80.51 | 45.56 | MixtralForCausalLM |
| internlm2-7b-llama | Pretrained Models | 77.4 | 66.94 | 60.49 | 80.99 | 63.16 | 54.25 | 79.87 | 62.85 | LlamaForCausalLM |
| neural-chat-7B-v3-2-GPTQ | Fine Tuned Models | 95.9 | 66.93 | 65.96 | 83.24 | 60.29 | 59.79 | 79.48 | 52.84 | MistralForCausalLM |
| Synthia-70B-v1.2 | Fine Tuned Models | 700 | 66.9 | 70.48 | 86.98 | 70.13 | 58.64 | 83.27 | 31.92 | LlamaForCausalLM |
| MoE-Merging | Merged Models or MoE Models | 241.5 | 66.84 | 65.44 | 84.58 | 61.31 | 57.83 | 77.66 | 54.21 | MixtralForCausalLM |
| DPOpenHermes-11B | Fine Tuned Models | 107.3 | 66.83 | 66.55 | 84.8 | 64.02 | 57.34 | 76.95 | 51.33 | MistralForCausalLM |
| Synthia-70B-v1.1 | Fine Tuned Models | 700 | 66.81 | 70.05 | 87.12 | 70.34 | 57.84 | 83.66 | 31.84 | LlamaForCausalLM |
| Frostwind-10.7B-v1 | Fine Tuned Models | 107 | 66.81 | 63.99 | 85.36 | 64.49 | 50.41 | 83.82 | 52.77 | LlamaForCausalLM |
| sonya-medium-x8-MoE | Fine Tuned Models | 699.2 | 66.76 | 64.25 | 83.7 | 62.53 | 60.15 | 76.24 | 53.68 | MixtralForCausalLM |
| Frostwind-10.7B-v1 | Fine Tuned Models | 107 | 66.75 | 64.16 | 85.38 | 64.64 | 50.43 | 83.74 | 52.16 | LlamaForCausalLM |
| Synthia-70B | Fine Tuned Models | 700 | 66.72 | 69.45 | 87.11 | 68.91 | 59.79 | 83.66 | 31.39 | LlamaForCausalLM |
| laser-dolphin-mixtral-4x7b-dpo | Merged Models or MoE Models | 241.5 | 66.71 | 64.93 | 85.81 | 63.04 | 63.77 | 77.82 | 44.88 | MixtralForCausalLM |
| Moe-3x7b-QA-Code-Inst | Fine Tuned Models | 185.2 | 66.7 | 64.25 | 84.6 | 62.15 | 63.15 | 77.43 | 48.6 | MixtralForCausalLM |
| Chinese-Mixtral-8x7B | Pretrained Models | 469.1 | 66.69 | 63.57 | 85.98 | 70.95 | 45.86 | 82.08 | 51.71 | MixtralForCausalLM |
| internlm2-7b | Pretrained Models | 70 | 66.68 | 58.02 | 81.24 | 65.24 | 48.73 | 83.82 | 63.0 | Unknown |
| Bookworm-10.7B-v0.4-DPO | Fine Tuned Models | 108 | 66.66 | 64.68 | 84.43 | 65.12 | 52.38 | 81.14 | 52.24 | LlamaForCausalLM |
| Barcenas-10.7b | Fine Tuned Models | 107.3 | 66.63 | 64.16 | 83.6 | 65.22 | 46.59 | 82.0 | 58.23 | LlamaForCausalLM |
| Pallas-0.5-LASER-0.6 | Fine Tuned Models | 343.9 | 66.62 | 62.46 | 81.6 | 74.25 | 54.39 | 78.45 | 48.6 | LlamaForCausalLM |
| NeuralHermes-2.5-Mistral-7B | Fine Tuned Models | 72.4 | 66.62 | 64.68 | 84.28 | 63.71 | 52.23 | 77.98 | 56.86 | MistralForCausalLM |
| Uni-TianYan | Fine Tuned Models | 0 | 66.61 | 72.1 | 87.4 | 69.91 | 65.81 | 82.32 | 22.14 | LlamaForCausalLM |
| Bookworm-10.7B-v0.4-DPO | Chat Models | 108 | 66.59 | 64.76 | 84.4 | 64.96 | 52.31 | 80.9 | 52.24 | LlamaForCausalLM |
| Adrastea-7b-v1.0-dpo | Fine Tuned Models | 72.4 | 66.59 | 63.31 | 82.3 | 62.26 | 53.1 | 76.56 | 62.02 | Unknown |
| LMCocktail-Mistral-7B-v1 | Fine Tuned Models | 70 | 66.58 | 66.21 | 85.69 | 61.64 | 61.37 | 77.35 | 47.23 | MistralForCausalLM |
| shark_tank_ai_7b_v2 | Chat Models | 72.4 | 66.55 | 67.75 | 87.06 | 58.79 | 62.15 | 78.45 | 45.11 | MistralForCausalLM |
| Tess-10.7B-v1.5 | Fine Tuned Models | 107 | 66.55 | 65.02 | 84.07 | 65.09 | 47.43 | 83.35 | 54.36 | LlamaForCausalLM |
| shark_tank_ai_7b_v2 | Fine Tuned Models | 72.4 | 66.54 | 67.58 | 87.02 | 58.88 | 62.21 | 78.69 | 44.88 | MistralForCausalLM |
| dpopenhermes-alpha-v0 | Fine Tuned Models | 72.4 | 66.52 | 65.02 | 83.96 | 63.67 | 51.75 | 78.85 | 55.88 | MistralForCausalLM |
| Soniox-7B-v1.0 | Fine Tuned Models | 72.4 | 66.5 | 63.91 | 82.55 | 64.38 | 53.84 | 78.06 | 56.25 | MistralForCausalLM |
| Llamafia | Fine Tuned Models | 72.4 | 66.49 | 66.13 | 82.08 | 61.81 | 47.94 | 80.11 | 60.88 | MistralForCausalLM |
| LLaMA-2-Wizard-70B-QLoRA | Chat Models | 700 | 66.47 | 67.58 | 87.52 | 69.11 | 61.79 | 82.32 | 30.48 | Unknown |
| A13 | Fine Tuned Models | 0 | 66.45 | 61.09 | 81.7 | 69.62 | 53.25 | 80.35 | 52.69 | Unknown |
| Math-OpenHermes-2.5-Mistral-7B | Fine Tuned Models | 72.4 | 66.42 | 63.05 | 83.07 | 63.21 | 50.91 | 77.19 | 61.11 | MistralForCausalLM |
| Instruct_Llama70B_Dolly15k | Fine Tuned Models | 700 | 66.42 | 68.34 | 87.21 | 69.52 | 46.46 | 84.29 | 42.68 | LlamaForCausalLM |
| PiVoT-SOLAR-10.7B-RP | Fine Tuned Models | 107 | 66.42 | 65.1 | 81.83 | 64.26 | 56.54 | 76.95 | 53.83 | LlamaForCausalLM |
| openhermes-2_5-dpo-no-robots | Chat Models | 72.4 | 66.4 | 64.93 | 84.3 | 63.86 | 52.12 | 77.9 | 55.27 | MistralForCausalLM |
| Snorkel-Mistral-PairRM-DPO | Chat Models | 0 | 66.31 | 66.04 | 85.64 | 60.83 | 70.86 | 77.74 | 36.77 | MistralForCausalLM |
| qCammel-70 | Fine Tuned Models | 687.2 | 66.31 | 68.34 | 87.87 | 70.18 | 57.47 | 84.29 | 29.72 | Unknown |
| qCammel-70x | Fine Tuned Models | 687.2 | 66.31 | 68.34 | 87.87 | 70.18 | 57.47 | 84.29 | 29.72 | Unknown |
| qCammel-70v1 | Fine Tuned Models | 687.2 | 66.31 | 68.34 | 87.87 | 70.18 | 57.47 | 84.29 | 29.72 | Unknown |
| qCammel70 | Fine Tuned Models | 687.2 | 66.31 | 68.34 | 87.87 | 70.18 | 57.47 | 84.29 | 29.72 | Unknown |
| qCammel-70-x | Fine Tuned Models | 0 | 66.31 | 68.34 | 87.87 | 70.18 | 57.47 | 84.29 | 29.72 | LlamaForCausalLM |
| Platypus2-70B | Fine Tuned Models | 689.8 | 66.28 | 70.65 | 87.15 | 70.08 | 52.37 | 84.37 | 33.06 | LlamaForCausalLM |
| Xwin-LM-70B-V0.1 | Fine Tuned Models | 700 | 66.2 | 70.22 | 87.25 | 69.77 | 59.86 | 82.87 | 27.22 | LlamaForCausalLM |
| Snorkel-Mistral-PairRM-DPO | Chat Models | 0 | 66.18 | 65.96 | 85.63 | 60.85 | 70.91 | 77.58 | 36.16 | MistralForCausalLM |
| MistralHermes-CodePro-7B-v1 | Fine Tuned Models | 70 | 66.17 | 62.46 | 82.68 | 63.44 | 49.67 | 77.9 | 60.88 | MistralForCausalLM |
| llama-2-70b-fb16-orca-chat-10k | Fine Tuned Models | 689.8 | 66.16 | 68.09 | 87.07 | 69.21 | 61.56 | 84.14 | 26.91 | LlamaForCausalLM |
| mistral-11b-slimorca | Chat Models | 107.3 | 66.12 | 64.25 | 83.81 | 63.66 | 54.66 | 77.98 | 52.39 | MistralForCausalLM |
| where-llambo-7b | Chat Models | 72.4 | 66.08 | 58.45 | 82.06 | 62.61 | 49.61 | 78.53 | 65.2 | MistralForCausalLM |
| NeuralHermes-2.5-Mistral-7B | Chat Models | 72.4 | 66.06 | 67.58 | 85.69 | 63.43 | 55.98 | 77.98 | 45.72 | MistralForCausalLM |
| llama-2-70b-Guanaco-QLoRA-fp16 | Fine Tuned Models | 700 | 66.05 | 68.26 | 88.32 | 70.23 | 55.69 | 83.98 | 29.8 | LlamaForCausalLM |
| KoSOLAR-10.7B-v0.1 | Merged Models or MoE Models | 108.6 | 66.04 | 62.03 | 84.54 | 65.56 | 45.03 | 83.58 | 55.5 | Unknown |
| SOLAR-10.7B-v1.0 | Pretrained Models | 107.3 | 66.04 | 61.95 | 84.6 | 65.48 | 45.04 | 83.66 | 55.5 | LlamaForCausalLM |
| pic_7B_mistral_Full_v0.1 | Fine Tuned Models | 72.4 | 66.0 | 63.91 | 83.7 | 63.3 | 54.51 | 77.9 | 52.69 | Unknown |
| medllama-2-70b-qlora-1.1 | Fine Tuned Models | 700 | 65.99 | 69.03 | 87.17 | 71.04 | 52.41 | 84.21 | 32.07 | Unknown |
| model_51 | Fine Tuned Models | 687.2 | 65.96 | 68.43 | 86.71 | 69.31 | 57.18 | 81.77 | 32.37 | Unknown |
| FusionNet_passthrough | Fine Tuned Models | 212 | 65.94 | 69.45 | 87.72 | 65.28 | 67.65 | 81.29 | 24.26 | LlamaForCausalLM |
| Mixtral_7Bx2_MoE_13B_DPO | Fine Tuned Models | 128.8 | 65.89 | 65.44 | 84.01 | 62.14 | 61.76 | 78.45 | 43.52 | MixtralForCausalLM |
| Mixtral-4x7B-DPO-RPChat | Fine Tuned Models | 241.5 | 65.88 | 64.59 | 85.36 | 63.57 | 49.87 | 78.77 | 53.15 | MixtralForCausalLM |
| laser-polyglot-4x7b | Merged Models or MoE Models | 241.5 | 65.79 | 64.16 | 84.98 | 63.88 | 55.47 | 77.82 | 48.45 | MixtralForCausalLM |
| MetaMath-Mistral-7B | Fine Tuned Models | 70 | 65.78 | 60.67 | 82.58 | 61.95 | 44.89 | 75.77 | 68.84 | MistralForCausalLM |
| Maixtchup-4x7b | Fine Tuned Models | 241.5 | 65.77 | 62.54 | 83.83 | 61.28 | 56.13 | 76.01 | 54.81 | MixtralForCausalLM |
| model_420 | Fine Tuned Models | 687.2 | 65.76 | 70.14 | 87.73 | 70.35 | 54.0 | 83.74 | 28.58 | Unknown |
| FusionNet_passthrough_v0.1 | Fine Tuned Models | 212 | 65.74 | 69.45 | 87.79 | 65.2 | 67.67 | 81.53 | 22.82 | LlamaForCausalLM |
| Swallow-70b-instruct-hf | Fine Tuned Models | 691.6 | 65.74 | 66.21 | 85.14 | 67.08 | 48.0 | 82.08 | 45.94 | LlamaForCausalLM |
| Mistral-7B-Instruct-v0.2-attention-sparsity-20 | Chat Models | 72.4 | 65.74 | 62.88 | 84.84 | 60.81 | 68.26 | 77.9 | 39.73 | MistralForCausalLM |
| Moe-4x7b-math-reason-code | Merged Models or MoE Models | 241.5 | 65.73 | 62.54 | 83.87 | 61.2 | 56.12 | 76.09 | 54.59 | MixtralForCausalLM |
| Moe-4x7b-reason-code-qa | Fine Tuned Models | 241.5 | 65.73 | 62.54 | 83.87 | 61.2 | 56.12 | 76.09 | 54.59 | MixtralForCausalLM |
| dolphin-2.6-mistral-7b-dpo-orca-v2 | Fine Tuned Models | 72.4 | 65.72 | 66.13 | 84.9 | 62.64 | 62.39 | 78.61 | 39.65 | MistralForCausalLM |
| llama-2-70b-dolphin-peft | Fine Tuned Models | 700 | 65.72 | 69.62 | 86.82 | 69.18 | 57.43 | 83.9 | 27.37 | Unknown |
| Mistral-7B-Instruct-v0.2 | Chat Models | 72.4 | 65.71 | 63.14 | 84.88 | 60.78 | 68.26 | 77.19 | 40.03 | MistralForCausalLM |
| alooowso | Chat Models | 72.4 | 65.63 | 62.97 | 84.87 | 60.78 | 68.18 | 77.43 | 39.58 | Unknown |
| Mistral-7B-Instruct-v0.2-2x7B-MoE | Fine Tuned Models | 128.8 | 65.6 | 62.97 | 84.88 | 60.74 | 68.18 | 77.43 | 39.42 | MixtralForCausalLM |
| LLaMA-2-Jannie-70B-QLoRA | Fine Tuned Models | 700 | 65.6 | 68.94 | 86.9 | 69.37 | 53.67 | 82.95 | 31.77 | Unknown |
| Yee-34B-200K-Chat | Fine Tuned Models | 340 | 65.56 | 65.61 | 84.33 | 74.91 | 53.88 | 79.79 | 34.8 | Unknown |
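To work with this table programmatically, it can be parsed into a pandas DataFrame. A minimal sketch, assuming the markdown table above has been saved to a file named `leaderboard.md` (the filename is hypothetical):

```python
import pandas as pd

# Parse the markdown table (assumed saved as leaderboard.md) into a DataFrame.
# Each table row is split on "|"; the |---| separator row is skipped.
rows = []
with open("leaderboard.md", encoding="utf-8") as f:
    for line in f:
        if line.lstrip().startswith("|") and "---" not in line:
            rows.append([cell.strip() for cell in line.strip().strip("|").split("|")])

df = pd.DataFrame(rows[1:], columns=rows[0])
numeric_cols = ["Parameters (×100M)", "Average", "ARC", "HellaSwag",
                "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

# Example query: the five highest-scoring Mistral-architecture models.
top_mistral = (df[df["Architecture"] == "MistralForCausalLM"]
               .nlargest(5, "Average")[["Model Name", "Average"]])
print(top_mistral)
```

Note that some models appear more than once (e.g., Starling-LM-7B-alpha, Frostwind-10.7B-v1) because the leaderboard lists separate submissions; deduplicate on Model Name first if you want one row per model.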