The Open LLM Leaderboard tracks evaluation results for large language models. It ranks and compares LLMs and chatbots based on their performance across a set of benchmark tasks.
Data source: HuggingFace
| Model | Type | Parameters (100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| Noromaid-7b-v0.1.1 | Fine Tuned Models | 72.4 | 61.49 | 62.2 | 84.28 | 63.44 | 44.3 | 77.9 | 36.85 | MistralForCausalLM |
| robin-65b-v2-fp16 | Fine Tuned Models | 650 | 61.48 | 61.95 | 84.6 | 62.51 | 52.31 | 80.51 | 26.99 | LlamaForCausalLM |
| Mistral-7B-v0.1-DPO | Fine Tuned Models | 72.4 | 61.47 | 61.26 | 83.94 | 63.76 | 42.68 | 78.77 | 38.44 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B | Fine Tuned Models | 72.4 | 61.45 | 64.93 | 84.3 | 63.82 | 52.31 | 77.9 | 25.47 | MistralForCausalLM |
| speechless-llama2-13b | Fine Tuned Models | 130.2 | 61.36 | 62.03 | 81.82 | 58.69 | 55.66 | 76.01 | 33.97 | LlamaForCausalLM |
| zephyr-7b-dpo-full-beta-0.2 | Chat Models | 72.4 | 61.36 | 61.86 | 83.98 | 61.85 | 54.78 | 76.95 | 28.73 | MistralForCausalLM |
| Metis-0.3-merged | Chat Models | 72.4 | 61.34 | 62.2 | 84.0 | 62.65 | 59.24 | 78.14 | 21.83 | Unknown |
| alpaca-lora-65B-HF | Fine Tuned Models | 650 | 61.33 | 64.85 | 85.59 | 63.11 | 45.15 | 81.22 | 28.05 | LlamaForCausalLM |
| phi-2 | Pretrained Models | 27.8 | 61.33 | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | PhiForCausalLM |
| Mistral-7B-v0.1-DPO | Chat Models | 72.4 | 61.3 | 60.32 | 83.69 | 64.01 | 43.53 | 79.01 | 37.23 | MistralForCausalLM |
| Deacon-20B | Chat Models | 200.9 | 61.28 | 60.75 | 81.74 | 60.7 | 58.49 | 76.8 | 29.19 | LlamaForCausalLM |
| Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v3 | Fine Tuned Models | 70 | 61.26 | 60.58 | 83.34 | 61.53 | 48.21 | 77.74 | 36.16 | MistralForCausalLM |
| dpo-phi2 | Chat Models | 27.8 | 61.26 | 61.69 | 75.13 | 58.1 | 43.99 | 74.19 | 54.44 | PhiForCausalLM |
| Phi-2-DPO | Fine Tuned Models | 27.8 | 61.25 | 60.75 | 75.03 | 57.75 | 44.46 | 73.64 | 55.88 | PhiForCausalLM |
| WizardLM-70B-V1.0 | Chat Models | 700 | 61.25 | 65.44 | 84.41 | 64.05 | 54.81 | 80.82 | 17.97 | LlamaForCausalLM |
| openchat_3.5 | Fine Tuned Models | 0 | 61.24 | 63.91 | 84.79 | 64.94 | 46.38 | 80.58 | 26.84 | MistralForCausalLM |
| Mini_DPO_test02 | Chat Models | 72.4 | 61.23 | 59.73 | 83.89 | 61.9 | 48.47 | 78.37 | 35.03 | MistralForCausalLM |
| openchat_3.5 | Fine Tuned Models | 0 | 61.22 | 63.82 | 84.8 | 64.98 | 46.39 | 80.74 | 26.61 | MistralForCausalLM |
| Mistral-Instruct-7B-v0.2-ChatAlpaca | Chat Models | 70 | 61.21 | 56.74 | 80.82 | 59.1 | 55.86 | 77.11 | 37.6 | Unknown |
| airoboros-65b-gpt4-2.0 | Fine Tuned Models | 650 | 61.2 | 66.64 | 86.66 | 63.18 | 49.11 | 80.74 | 20.85 | LlamaForCausalLM |
| mistral-7b-slimorcaboros | Fine Tuned Models | 70 | 61.18 | 63.65 | 83.7 | 63.46 | 55.81 | 77.03 | 23.43 | MistralForCausalLM |
| speechless-mistral-hermes-code-7b | Fine Tuned Models | 72.4 | 61.16 | 59.39 | 78.55 | 59.88 | 51.26 | 77.27 | 40.64 | MistralForCausalLM |
| jackalope-7b | Fine Tuned Models | 70 | 61.16 | 63.4 | 83.29 | 63.5 | 50.06 | 78.06 | 28.66 | MistralForCausalLM |
| mistral-7b-ft-h4-no_robots_instructions | Chat Models | 72.4 | 61.16 | 60.92 | 83.17 | 63.37 | 43.63 | 78.85 | 37.0 | MistralForCausalLM |
| mistral-7b-ft-h4-no_robots_instructions | Chat Models | 72.4 | 61.16 | 60.92 | 83.24 | 63.74 | 43.64 | 78.69 | 36.69 | MistralForCausalLM |
| DPO_mistral_7b_alpaca_0124_v1 | Fine Tuned Models | 72.4 | 61.15 | 63.4 | 73.2 | 60.51 | 66.76 | 77.19 | 25.85 | MistralForCausalLM |
| PlatYi-34B-Llama-Q-v3 | Chat Models | 343.9 | 61.15 | 64.33 | 84.88 | 74.98 | 51.8 | 84.21 | 6.67 | LlamaForCausalLM |
| airoboros-65b-gpt4-2.0 | Fine Tuned Models | 650 | 61.14 | 66.81 | 86.66 | 63.41 | 49.17 | 80.27 | 20.55 | LlamaForCausalLM |
| dolphin-2.1-mistral-7b | Chat Models | 71.1 | 61.12 | 64.42 | 84.92 | 63.32 | 55.56 | 77.74 | 20.77 | Unknown |
| Kant-Test-0.1-Mistral-7B | Fine Tuned Models | 72.4 | 61.1 | 61.77 | 82.89 | 62.86 | 49.4 | 78.53 | 31.16 | MistralForCausalLM |
| ZySec-7B-v1 | Fine Tuned Models | 72.4 | 61.08 | 63.48 | 85.01 | 60.14 | 56.49 | 78.14 | 23.2 | MistralForCausalLM |
| Dans-07YahooAnswers-7b | Unknown Model Types | 70 | 61.07 | 61.52 | 83.69 | 63.52 | 41.84 | 78.53 | 37.3 | MistralForCausalLM |
| AISquare-Instruct-SOLAR-10.7b-v0.5.31 | Chat Models | 107 | 61.05 | 60.67 | 84.2 | 52.86 | 51.35 | 82.95 | 34.27 | LlamaForCausalLM |
| dolphin-2.1-mistral-7b | Fine Tuned Models | 71.1 | 61.0 | 63.99 | 85.0 | 63.44 | 55.57 | 77.9 | 20.09 | Unknown |
| Llama2_init_Mistral | Merged Models or MoE Models | 72.4 | 60.98 | 60.07 | 83.3 | 64.09 | 42.15 | 78.37 | 37.91 | LlamaForCausalLM |
| Mistral-7B-v0.1 | Pretrained Models | 72.4 | 60.97 | 59.98 | 83.31 | 64.16 | 42.15 | 78.37 | 37.83 | MistralForCausalLM |
| Merak-7B-v5-PROTOTYPE1 | Fine Tuned Models | 70 | 60.96 | 62.2 | 82.07 | 60.97 | 45.41 | 77.9 | 37.23 | MistralForCausalLM |
| Pallas-0.5-frankenmerge | Fine Tuned Models | 360.6 | 60.95 | 61.77 | 80.36 | 67.62 | 54.07 | 77.74 | 24.11 | LlamaForCausalLM |
| openbuddy-falcon-40b-v16.1-4k | Fine Tuned Models | 413.5 | 60.94 | 60.58 | 83.86 | 56.05 | 50.57 | 77.82 | 36.77 | FalconForCausalLM |
| Mistral-7B-LoreWeaver | Fine Tuned Models | 72.4 | 60.93 | 59.98 | 83.29 | 64.12 | 42.15 | 78.37 | 37.68 | Unknown |
| speechless-mistral-moloras-7b | Fine Tuned Models | 72.4 | 60.93 | 59.98 | 83.29 | 64.12 | 42.15 | 78.37 | 37.68 | MistralForCausalLM |
| mistral-sft-v3 | Pretrained Models | 72.4 | 60.93 | 61.35 | 82.23 | 63.4 | 48.49 | 77.66 | 32.45 | MistralForCausalLM |
| openbuddy-mistral-7b-v17.1-32k | Fine Tuned Models | 72.8 | 60.92 | 55.55 | 77.95 | 58.29 | 56.06 | 74.98 | 42.68 | MistralForCausalLM |
| openchat-3.5-0106-11b | Fine Tuned Models | 107.3 | 60.91 | 63.65 | 78.64 | 62.54 | 48.07 | 78.06 | 34.5 | MistralForCausalLM |
| alpaca-lora-65b-en-pt-es-ca | Fine Tuned Models | 650 | 60.89 | 65.02 | 84.88 | 62.19 | 46.06 | 80.51 | 26.69 | Unknown |
| juud-Mistral-7B-dpo | Chat Models | 72.4 | 60.89 | 66.81 | 84.89 | 63.03 | 53.51 | 78.3 | 18.8 | MistralForCausalLM |
| SlimOpenOrca-Mistral-7B | Fine Tuned Models | 72.4 | 60.84 | 62.97 | 83.49 | 62.3 | 57.39 | 77.43 | 21.46 | MistralForCausalLM |
| Einstein-7B | Fine Tuned Models | 70 | 60.81 | 61.6 | 84.35 | 62.87 | 42.55 | 77.51 | 36.01 | MistralForCausalLM |
| speechless-mistral-dolphin-orca-platypus-samantha-7b | Fine Tuned Models | 72.4 | 60.79 | 64.33 | 84.4 | 63.72 | 52.52 | 78.37 | 21.38 | MistralForCausalLM |
| airoboros-65b-gpt4-m2.0 | Fine Tuned Models | 650 | 60.79 | 65.02 | 86.35 | 64.37 | 46.66 | 80.19 | 22.14 | LlamaForCausalLM |
| openbuddy-llama-30b-v7.1-bf16 | Fine Tuned Models | 323.5 | 60.76 | 62.37 | 82.29 | 58.18 | 52.6 | 77.51 | 31.61 | Unknown |
| speechless-mistral-six-in-one-7b | Fine Tuned Models | 72.4 | 60.76 | 62.97 | 84.6 | 63.29 | 57.77 | 77.51 | 18.42 | MistralForCausalLM |
| oasst-rlhf-2-llama-30b-7k-steps-hf | Chat Models | 300 | 60.74 | 61.35 | 83.8 | 57.89 | 51.18 | 78.77 | 31.46 | LlamaForCausalLM |
| Venomia-m7 | Fine Tuned Models | 72.4 | 60.74 | 63.14 | 84.0 | 60.06 | 49.08 | 75.77 | 32.37 | MistralForCausalLM |
| openbuddy-llama-30b-v7.1-bf16 | Fine Tuned Models | 323.5 | 60.71 | 62.46 | 82.3 | 58.15 | 52.57 | 77.82 | 30.93 | Unknown |
| Nanbeige-16B-Base-Llama | Pretrained Models | 158.3 | 60.7 | 56.48 | 78.97 | 63.34 | 42.6 | 75.77 | 47.01 | LlamaForCausalLM |
| openbuddy-mistral-7b-v17.1-32k | Chat Models | 72.8 | 60.69 | 55.38 | 78.0 | 58.08 | 56.07 | 75.22 | 41.39 | MistralForCausalLM |
| airoboros-65b-gpt4-m2.0 | Fine Tuned Models | 650 | 60.68 | 65.1 | 86.34 | 64.32 | 46.63 | 80.11 | 21.61 | LlamaForCausalLM |
| airoboros-65b-gpt4-1.4 | Unknown Model Types | 650 | 60.67 | 65.78 | 85.83 | 62.27 | 52.45 | 79.64 | 18.04 | LlamaForCausalLM |
| airoboros-65b-gpt4-1.4-peft | Fine Tuned Models | 650 | 60.67 | 65.78 | 85.83 | 62.27 | 52.45 | 79.64 | 18.04 | Unknown |
| Zephyrus-L1-33B | Fine Tuned Models | 325.3 | 60.61 | 64.51 | 84.15 | 57.37 | 53.87 | 80.19 | 23.58 | LlamaForCausalLM |
| airoboros-65b-gpt4-1.4 | Fine Tuned Models | 650 | 60.59 | 65.53 | 85.77 | 61.95 | 52.43 | 79.79 | 18.04 | LlamaForCausalLM |
| Influxient-4x13B | Merged Models or MoE Models | 385 | 60.57 | 61.26 | 83.42 | 57.25 | 54.1 | 74.35 | 33.06 | MixtralForCausalLM |
| Synatra-7B-v0.3-dpo | Chat Models | 70 | 60.55 | 62.8 | 82.58 | 61.46 | 56.46 | 76.24 | 23.73 | MistralForCausalLM |
| dolphin-2.2.1-mistral-7b | Fine Tuned Models | 70 | 60.54 | 63.48 | 83.86 | 63.28 | 53.17 | 78.37 | 21.08 | Unknown |
| kalomaze-stuff | Fine Tuned Models | 0 | 60.53 | 59.64 | 83.55 | 63.41 | 41.64 | 78.61 | 36.32 | Unknown |
| Mistral-11B-TestBench9 | Fine Tuned Models | 107.3 | 60.52 | 64.08 | 84.24 | 64.0 | 56.19 | 78.45 | 16.15 | Unknown |
| WizardLM-70B-V1.0-GPTQ | Fine Tuned Models | 728.2 | 60.5 | 63.82 | 83.85 | 63.68 | 54.54 | 78.61 | 18.5 | LlamaForCausalLM |
| Damysus-2.7B-Chat | Chat Models | 27.8 | 60.49 | 59.81 | 74.52 | 56.33 | 46.74 | 74.9 | 50.64 | PhiForCausalLM |
| traversaal-2.5-Mistral-7B | Chat Models | 72.4 | 60.48 | 66.21 | 85.02 | 63.24 | 54.0 | 77.9 | 16.53 | MistralForCausalLM |
| Dolphin2.1-OpenOrca-7B | Fine Tuned Models | 72.4 | 60.47 | 63.91 | 84.26 | 62.66 | 53.84 | 78.22 | 19.94 | MistralForCausalLM |
| Stellaris-internlm2-20b-r512 | Fine Tuned Models | 200 | 60.46 | 63.82 | 84.0 | 66.34 | 49.51 | 84.45 | 14.63 | LlamaForCausalLM |
| Instruct_Mistral-7B-v0.1_Dolly15K | Fine Tuned Models | 72.4 | 60.45 | 59.39 | 82.62 | 62.71 | 43.56 | 79.32 | 35.1 | MistralForCausalLM |
| WizardMath-70B-V1.0 | Chat Models | 700 | 60.42 | 68.17 | 86.49 | 68.89 | 52.69 | 82.32 | 3.94 | LlamaForCausalLM |
| WizardMath-70B-V1.0 | Chat Models | 700 | 60.41 | 67.92 | 86.46 | 68.92 | 52.77 | 82.32 | 4.09 | LlamaForCausalLM |
| Xenon-4 | Chat Models | 72.4 | 60.39 | 60.15 | 83.07 | 60.08 | 61.31 | 77.03 | 20.7 | MistralForCausalLM |
| SlimOrca-13B | Fine Tuned Models | 130 | 60.39 | 60.15 | 81.4 | 57.04 | 49.37 | 74.43 | 39.95 | LlamaForCausalLM |
| speechless-mistral-7b-dare-0.85 | Fine Tuned Models | 72.4 | 60.39 | 63.31 | 84.93 | 64.22 | 50.68 | 79.32 | 19.86 | MistralForCausalLM |
| Dolphin2.1-OpenOrca-7B | Fine Tuned Models | 72.4 | 60.38 | 64.16 | 84.25 | 62.7 | 53.83 | 77.66 | 19.71 | MistralForCausalLM |
| phi-2-openhermes-30k | Fine Tuned Models | 0 | 60.37 | 61.01 | 74.72 | 57.17 | 45.38 | 74.9 | 49.05 | PhiForCausalLM |
| Mistral-7B-SlimOrca | Fine Tuned Models | 70 | 60.37 | 62.54 | 83.86 | 62.77 | 54.23 | 77.43 | 21.38 | MistralForCausalLM |
| Kesehatan-7B-v0.1 | Fine Tuned Models | 72.4 | 60.37 | 60.32 | 82.54 | 59.94 | 50.68 | 76.48 | 32.22 | MistralForCausalLM |
| Mistral-7B-Discord-0.1 | Fine Tuned Models | 72.4 | 60.28 | 60.24 | 83.13 | 62.82 | 44.1 | 78.93 | 32.45 | MistralForCausalLM |
| Xenon-3 | Chat Models | 72.4 | 60.27 | 58.87 | 83.39 | 59.79 | 61.99 | 77.51 | 20.09 | MistralForCausalLM |
| Psyfighter2-Noromaid-ties-Capybara-13B | Fine Tuned Models | 130.2 | 60.27 | 62.29 | 83.87 | 56.59 | 51.44 | 77.03 | 30.4 | LlamaForCausalLM |
| openchat_3.5 | Fine Tuned Models | 0 | 60.26 | 62.46 | 83.96 | 62.89 | 45.43 | 81.06 | 25.78 | MistralForCausalLM |
| SlimOpenOrca-Mistral-7B-v2 | Fine Tuned Models | 72.4 | 60.25 | 62.88 | 83.41 | 62.05 | 56.65 | 77.58 | 18.95 | Unknown |
| Mistral-11B-TestBench11 | Fine Tuned Models | 107.3 | 60.25 | 64.42 | 83.93 | 63.82 | 56.68 | 77.74 | 14.94 | Unknown |
| Damysus-2.7B-Chat | Chat Models | 27.8 | 60.25 | 59.13 | 74.36 | 56.34 | 46.45 | 75.06 | 50.19 | PhiForCausalLM |
| wendigo-14b-alpha4 | Fine Tuned Models | 142.2 | 60.25 | 59.3 | 79.65 | 59.85 | 54.98 | 74.74 | 32.98 | MistralForCausalLM |
| smartyplats-7b-v2 | Chat Models | 70 | 60.24 | 57.94 | 80.76 | 58.16 | 50.26 | 75.53 | 38.82 | MistralForCausalLM |
| GPlatty-30B | Fine Tuned Models | 323.2 | 60.23 | 65.78 | 84.79 | 63.49 | 52.45 | 80.98 | 13.87 | Unknown |
| notus-7b-v1 | Fine Tuned Models | 72.4 | 60.22 | 64.59 | 84.78 | 63.03 | 54.37 | 79.4 | 15.16 | MistralForCausalLM |
| Tenebra_30B_Alpha01_FP16 | Fine Tuned Models | 325.3 | 60.18 | 64.51 | 84.79 | 54.29 | 54.22 | 78.61 | 24.64 | LlamaForCausalLM |
| Mistral-7B-OpenOrca | Fine Tuned Models | 70 | 60.17 | 64.08 | 83.99 | 62.24 | 53.05 | 77.74 | 19.94 | MistralForCausalLM |
| airoboros-65b-gpt4-1.3 | Fine Tuned Models | 650 | 60.15 | 66.13 | 85.99 | 63.89 | 51.32 | 79.95 | 13.65 | LlamaForCausalLM |
| firefly-zephyr-6x7b-lora | Fine Tuned Models | 70 | 60.13 | 61.01 | 82.8 | 60.09 | 48.84 | 77.03 | 31.01 | Unknown |
| wendigo-14b-alpha3 | Fine Tuned Models | 142.2 | 60.1 | 59.39 | 79.51 | 59.72 | 55.12 | 74.74 | 32.15 | Unknown |
| zephyr-python-ru-merged | Fine Tuned Models | 72.4 | 60.1 | 56.06 | 82.06 | 60.2 | 52.81 | 76.95 | 32.52 | MistralForCausalLM |
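As a sanity check on the table, the Average column appears to be the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K); the last decimal can differ by ±0.01 because the displayed per-benchmark scores are themselves rounded. A minimal sketch (the function name is illustrative, not part of any leaderboard API):

```python
def leaderboard_average(scores):
    """Arithmetic mean of the six benchmark scores
    (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K),
    rounded to two decimals like the Average column above."""
    if len(scores) != 6:
        raise ValueError("expected exactly six benchmark scores")
    return round(sum(scores) / len(scores), 2)

# Noromaid-7b-v0.1.1 row from the table; its Average is shown as 61.49.
print(leaderboard_average([62.2, 84.28, 63.44, 44.3, 77.9, 36.85]))
```

Recomputing from rounded benchmark values reproduces the listed Average to within about 0.01 (e.g. phi-2 recomputes to 61.32 against a listed 61.33), which is consistent with upstream scores carrying more precision than the table displays.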