The Open LLM Leaderboard tracks evaluation results for large language models, ranking and assessing LLMs and chatbots by their performance across a set of benchmark tasks.
Data source: HuggingFace. Data is for reference only; official sources are authoritative. Click model names to view DataLearner model profiles.
| Model | Type | Parameters (100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| Mistral-7B-v0.2-meditron-turkish | Fine Tuned Models | 72.4 | 63.34 | 59.56 | 81.79 | 60.35 | 66.19 | 76.24 | 35.94 | MistralForCausalLM |
| aanaphi2-v0.1 | Chat Models | 27.8 | 63.28 | 63.91 | 77.97 | 57.73 | 51.56 | 73.64 | 54.89 | PhiForCausalLM |
| mistral-7B-forest-dpo | Fine Tuned Models | 70 | 63.28 | 65.02 | 86.31 | 63.05 | 55.43 | 79.56 | 30.33 | MistralForCausalLM |
| Gecko-7B-v0.1-DPO | Fine Tuned Models | 72.4 | 63.22 | 56.74 | 82.38 | 60.42 | 57.42 | 77.35 | 45.03 | MistralForCausalLM |
| flux-base-optimized | Fine Tuned Models | 72.4 | 63.22 | 65.44 | 81.74 | 59.74 | 50.02 | 77.74 | 44.66 | MistralForCausalLM |
| Aeryth-7B-v0.1 | Fine Tuned Models | 72.4 | 63.19 | 60.32 | 83.53 | 60.97 | 63.57 | 74.66 | 36.09 | MistralForCausalLM |
| DeciLM-7B-instruct | Chat Models | 70.4 | 63.19 | 61.01 | 82.37 | 60.24 | 49.75 | 79.72 | 46.02 | DeciLMForCausalLM |
| dm7b_sft_gpt88w_merge | Fine Tuned Models | 70 | 63.18 | 62.29 | 82.47 | 61.35 | 53.33 | 77.58 | 42.08 | MistralForCausalLM |
| kaori-34b-v3 | Fine Tuned Models | 343.9 | 63.18 | 64.25 | 79.59 | 70.18 | 52.37 | 76.48 | 36.24 | LlamaForCausalLM |
| Yi-34B-Chat | Chat Models | 343.9 | 63.17 | 65.1 | 84.08 | 74.87 | 55.41 | 79.79 | 19.79 | LlamaForCausalLM |
| Dionysus-Mistral-m3-v5 | Fine Tuned Models | 72.4 | 63.14 | 59.56 | 80.99 | 61.18 | 50.93 | 75.14 | 51.02 | MistralForCausalLM |
| testllm-c2 | Fine Tuned Models | 0 | 63.13 | 60.58 | 81.91 | 61.2 | 49.87 | 77.82 | 47.38 | MistralForCausalLM |
| flux-base-optimized | Fine Tuned Models | 72.4 | 63.12 | 65.53 | 81.76 | 59.84 | 50.03 | 77.35 | 44.2 | MistralForCausalLM |
| Qwen-14B-Llamafied | Pretrained Models | 140 | 63.09 | 55.2 | 82.31 | 66.11 | 45.6 | 76.56 | 52.77 | LlamaForCausalLM |
| PiVoT-MoE | Fine Tuned Models | 361 | 63.04 | 63.91 | 83.52 | 60.71 | 54.64 | 76.32 | 39.12 | MixtralForCausalLM |
| Airoboros-L2-70B-2.1-GPTQ | Fine Tuned Models | 728.2 | 63.04 | 70.39 | 86.54 | 68.89 | 55.55 | 81.61 | 15.24 | LlamaForCausalLM |
| blossom-v3-mistral-7b | Fine Tuned Models | 70 | 62.95 | 60.49 | 81.9 | 61.35 | 50.31 | 76.95 | 46.7 | MistralForCausalLM |
| Thespis-7b-v0.2-SFTTest-3Epoch | Fine Tuned Models | 72.4 | 62.94 | 63.23 | 84.39 | 62.59 | 53.9 | 77.51 | 36.01 | MistralForCausalLM |
| speechless-zephyr-code-functionary-7b | Fine Tuned Models | 72.4 | 62.93 | 61.52 | 83.88 | 64.71 | 44.99 | 78.69 | 43.82 | MistralForCausalLM |
| dolphin-2.6-mistral-7b-dpo-orca-v3 | Fine Tuned Models | 72.4 | 62.93 | 66.3 | 84.53 | 62.36 | 61.29 | 77.58 | 25.55 | MistralForCausalLM |
| CollectiveCognition-v1.1-Mistral-7B | Chat Models | 70 | 62.92 | 62.12 | 84.17 | 62.35 | 57.62 | 75.37 | 35.86 | MistralForCausalLM |
| zephyr-7b-sft-full-spin-iter1 | Fine Tuned Models | 72.4 | 62.86 | 65.87 | 85.44 | 60.95 | 57.39 | 76.64 | 30.86 | Unknown |
| test | Fine Tuned Models | 72.4 | 62.86 | 65.87 | 85.44 | 60.95 | 57.39 | 76.64 | 30.86 | Unknown |
| zephyr-7b-sft-full-SPIN-iter1 | Fine Tuned Models | 72.4 | 62.86 | 65.87 | 85.44 | 60.95 | 57.39 | 76.64 | 30.86 | MistralForCausalLM |
| germeo-7b-laser | Fine Tuned Models | 72.4 | 62.82 | 60.75 | 82.81 | 60.57 | 53.83 | 75.61 | 43.37 | MistralForCausalLM |
| Birbal-7B-V1 | Fine Tuned Models | 70 | 62.82 | 62.88 | 84.88 | 63.71 | 45.46 | 78.53 | 41.47 | Unknown |
| travel-mistral-7B-16b-base | Fine Tuned Models | 72.4 | 62.82 | 61.43 | 83.51 | 62.55 | 53.23 | 78.53 | 37.68 | MistralForCausalLM |
| openbuddy-mixtral-7bx8-v17.3-32k | Chat Models | 467.4 | 62.81 | 64.51 | 66.96 | 70.0 | 59.14 | 68.11 | 48.14 | MixtralForCausalLM |
| Multilingual-mistral | Merged Models or MoE Models | 467 | 62.79 | 62.29 | 81.76 | 61.38 | 55.53 | 75.53 | 40.26 | MixtralForCausalLM |
| llama-65b | Pretrained Models | 652.9 | 62.79 | 63.48 | 86.09 | 63.93 | 43.43 | 82.56 | 37.23 | LlamaForCausalLM |
| neural-chat-7b-v3-1-dare-0.85 | Fine Tuned Models | 70 | 62.74 | 61.95 | 83.84 | 64.43 | 44.9 | 79.16 | 42.15 | MistralForCausalLM |
| ARIA-70B-V3 | Fine Tuned Models | 689.8 | 62.73 | 63.91 | 86.21 | 64.75 | 51.32 | 82.08 | 28.13 | LlamaForCausalLM |
| SG-Raccoon-Yi-200k-2.0 | Chat Models | 555.9 | 62.72 | 62.54 | 80.26 | 73.29 | 53.21 | 76.32 | 30.71 | Unknown |
| Hercules-2.0-Mistral-7B | Chat Models | 72.4 | 62.69 | 61.09 | 83.69 | 63.47 | 43.97 | 79.48 | 44.43 | MistralForCausalLM |
| internlm2-base-20b-llama | Pretrained Models | 198.6 | 62.69 | 63.05 | 82.11 | 63.97 | 43.97 | 78.22 | 44.81 | LlamaForCausalLM |
| internlm2-base-20b-llama | Pretrained Models | 198.6 | 62.69 | 62.97 | 82.15 | 63.78 | 44.11 | 78.22 | 44.88 | LlamaForCausalLM |
| CodegebraGPT-10b | Fine Tuned Models | 100 | 62.68 | 59.81 | 83.42 | 60.2 | 46.57 | 80.98 | 45.11 | Unknown |
| guanaco-65B-HF | Fine Tuned Models | 650 | 62.67 | 65.44 | 86.47 | 62.92 | 52.81 | 82.4 | 26.0 | LlamaForCausalLM |
| zephyr-7b-dpo-qlora-no-sft | Chat Models | 70 | 62.67 | 62.46 | 84.5 | 64.02 | 44.25 | 79.16 | 41.62 | Unknown |
| Tess-XS-v1-3-yarn-128K | Fine Tuned Models | 0 | 62.66 | 61.09 | 82.95 | 62.15 | 50.13 | 74.43 | 45.19 | MistralForCausalLM |
| Metis-0.5 | Chat Models | 72.4 | 62.65 | 62.63 | 83.77 | 62.16 | 49.33 | 75.14 | 42.91 | MistralForCausalLM |
| Birbal-7B-V1 | Fine Tuned Models | 70 | 62.6 | 62.8 | 84.83 | 63.59 | 45.34 | 78.77 | 40.26 | Unknown |
| airoboros-l2-70b-gpt4-2.0 | Fine Tuned Models | 700 | 62.6 | 68.6 | 87.53 | 69.37 | 48.52 | 83.9 | 17.66 | LlamaForCausalLM |
| VicUnlocked-alpaca-65B-QLoRA-fp16 | Fine Tuned Models | 650 | 62.58 | 65.61 | 85.15 | 63.13 | 52.47 | 81.29 | 27.82 | LlamaForCausalLM |
| internlm2-chat-20b-llama | Chat Models | 198.6 | 62.56 | 63.65 | 82.58 | 66.89 | 48.74 | 79.56 | 33.97 | LlamaForCausalLM |
| blossom-v3_1-mistral-7b | Fine Tuned Models | 70 | 62.53 | 60.49 | 81.71 | 61.0 | 49.51 | 75.53 | 46.93 | MistralForCausalLM |
| CodegebraGPT-10b | Fine Tuned Models | 100 | 62.53 | 59.56 | 83.45 | 60.07 | 46.53 | 81.06 | 44.5 | Unknown |
| Tess-XS-v1-3-yarn-128K | Fine Tuned Models | 0 | 62.49 | 61.6 | 82.96 | 62.1 | 50.2 | 74.74 | 43.37 | MistralForCausalLM |
| test_merged_model | Fine Tuned Models | 72.4 | 62.42 | 61.6 | 83.1 | 63.73 | 48.65 | 78.45 | 38.97 | MistralForCausalLM |
| Kant-Test-0.1-Mistral-7B | Fine Tuned Models | 72.4 | 62.42 | 62.37 | 82.84 | 63.38 | 49.62 | 78.3 | 37.98 | MistralForCausalLM |
| zephyr-7b-sft-full-SPIN-iter0 | Fine Tuned Models | 72.4 | 62.37 | 63.65 | 84.44 | 61.01 | 50.48 | 77.98 | 36.69 | MistralForCausalLM |
| test0 | Fine Tuned Models | 72.4 | 62.37 | 63.65 | 84.44 | 61.01 | 50.48 | 77.98 | 36.69 | Unknown |
| ARIA-70B-French | Fine Tuned Models | 700 | 62.37 | 64.51 | 85.87 | 63.88 | 52.8 | 80.51 | 26.69 | LlamaForCausalLM |
| Multirial | Merged Models or MoE Models | 467 | 62.37 | 63.23 | 79.57 | 61.01 | 54.7 | 75.3 | 40.41 | MixtralForCausalLM |
| karakuri-lm-70b-chat-v0.1 | Chat Models | 692 | 62.36 | 61.52 | 83.13 | 59.35 | 51.39 | 78.37 | 40.41 | LlamaForCausalLM |
| airoboros-65b-gpt4-1.2 | Fine Tuned Models | 650 | 62.36 | 65.87 | 86.08 | 63.37 | 52.72 | 79.56 | 26.54 | LlamaForCausalLM |
| zephyr-7b-alpha-dare-0.85 | Fine Tuned Models | 70 | 62.35 | 61.18 | 83.67 | 64.3 | 44.41 | 78.45 | 42.08 | MistralForCausalLM |
| zephyr-7b-sft-full-SPIN-iter0 | Fine Tuned Models | 72.4 | 62.32 | 63.57 | 84.43 | 61.28 | 50.34 | 77.98 | 36.32 | MistralForCausalLM |
| Mistral-7B-Discord-0.1-DPO | Fine Tuned Models | 72.4 | 62.29 | 63.23 | 83.27 | 62.62 | 55.28 | 78.93 | 30.4 | MistralForCausalLM |
| finetuned-Mistral-7B-Instruct-v0.2-5000-v2.0 | Fine Tuned Models | 72.4 | 62.27 | 59.3 | 82.65 | 58.45 | 59.54 | 77.66 | 36.01 | MistralForCausalLM |
| Tess-7B-v1.4 | Fine Tuned Models | 72.4 | 62.19 | 60.41 | 82.87 | 60.98 | 51.88 | 74.82 | 42.15 | MistralForCausalLM |
| Orca-2-13b-f16 | Fine Tuned Models | 130 | 62.14 | 60.67 | 79.81 | 60.37 | 56.41 | 76.64 | 38.97 | LlamaForCausalLM |
| lemur-70b-v1 | Fine Tuned Models | 700 | 62.07 | 64.33 | 85.72 | 65.85 | 44.78 | 83.03 | 28.73 | LlamaForCausalLM |
| speechless-mistral-dolphin-orca-platypus-samantha-7b-dare-0.85 | Fine Tuned Models | 70 | 62.06 | 61.69 | 83.85 | 64.43 | 43.13 | 78.93 | 40.33 | MistralForCausalLM |
| PlatYi-34B-200K-Q | Chat Models | 343.9 | 62.0 | 63.91 | 83.52 | 75.19 | 44.21 | 81.06 | 24.11 | LlamaForCausalLM |
| Iambe-20b-DARE-v2 | Chat Models | 199.9 | 61.99 | 62.8 | 84.53 | 60.45 | 53.85 | 77.03 | 33.28 | LlamaForCausalLM |
| zephyr-beta-math | Fine Tuned Models | 0 | 61.99 | 56.66 | 81.26 | 57.24 | 44.83 | 75.53 | 56.41 | MistralForCausalLM |
| Synthia-7B-v3.0 | Fine Tuned Models | 72.4 | 61.99 | 62.46 | 83.79 | 63.9 | 43.85 | 77.9 | 40.03 | MistralForCausalLM |
| Orca-2-13b | Fine Tuned Models | 130 | 61.98 | 60.92 | 79.85 | 60.3 | 56.42 | 76.56 | 37.83 | LlamaForCausalLM |
| zephyr-7b-beta | Fine Tuned Models | 72.4 | 61.95 | 62.03 | 84.36 | 61.07 | 57.45 | 77.74 | 29.04 | MistralForCausalLM |
| zephyr-7b-truthy | Chat Models | 72.4 | 61.93 | 60.75 | 84.64 | 59.53 | 63.31 | 77.9 | 25.47 | MistralForCausalLM |
| ARIA-70B-V2 | Fine Tuned Models | 700 | 61.93 | 62.12 | 85.68 | 63.49 | 49.8 | 81.69 | 28.81 | LlamaForCausalLM |
| dolphin-2.6-mistral-7b-dpo-orca | Fine Tuned Models | 72.4 | 61.92 | 66.04 | 84.62 | 62.28 | 59.97 | 78.3 | 20.32 | MistralForCausalLM |
| dolphin-2.6-mistral-7b-dpo-orca-v1 | Fine Tuned Models | 72.4 | 61.92 | 66.04 | 84.62 | 62.28 | 59.97 | 78.3 | 20.32 | MistralForCausalLM |
| Mistral-Syndicate-7B | Fine Tuned Models | 72.4 | 61.9 | 60.84 | 82.91 | 60.83 | 43.71 | 78.61 | 44.5 | MistralForCausalLM |
| uniwiz-7B-v0.1 | Fine Tuned Models | 72.4 | 61.87 | 61.77 | 84.16 | 64.16 | 44.96 | 78.85 | 37.3 | MistralForCausalLM |
| Winterreise-m7 | Fine Tuned Models | 0 | 61.86 | 61.26 | 83.84 | 63.85 | 45.55 | 79.08 | 37.6 | MistralForCausalLM |
| Noromaid-7b-v0.2 | Fine Tuned Models | 70 | 61.86 | 62.12 | 84.92 | 63.1 | 46.09 | 78.22 | 36.69 | MistralForCausalLM |
| bagel-8x7b-v0.2 | Fine Tuned Models | 467 | 61.83 | 68.26 | 86.32 | 70.4 | 60.03 | 81.29 | 4.7 | MixtralForCausalLM |
| Mistral-7B-Instruct-v0.2 | Chat Models | 70 | 61.79 | 60.15 | 82.79 | 60.07 | 56.06 | 76.87 | 34.8 | Unknown |
| Noromaid-7b-v0.2 | Fine Tuned Models | 70 | 61.78 | 62.03 | 84.97 | 62.99 | 46.07 | 78.37 | 36.24 | MistralForCausalLM |
| test | Fine Tuned Models | 0 | 61.76 | 62.29 | 84.42 | 61.07 | 57.51 | 78.06 | 27.22 | Unknown |
| test_model | Fine Tuned Models | 0 | 61.76 | 62.29 | 84.42 | 61.07 | 57.51 | 78.06 | 27.22 | Unknown |
| Mistral-Syndicate-7B | Fine Tuned Models | 72.4 | 61.74 | 60.84 | 82.88 | 60.52 | 43.73 | 78.45 | 44.05 | MistralForCausalLM |
| juud-Mistral-7B | Chat Models | 72.4 | 61.72 | 66.72 | 85.0 | 63.38 | 54.12 | 77.98 | 23.12 | MistralForCausalLM |
| Metabird-7b-DPO | Fine Tuned Models | 72.4 | 61.7 | 65.96 | 86.29 | 64.46 | 60.3 | 81.37 | 11.83 | MistralForCausalLM |
| MythoMist-7b | Fine Tuned Models | 72.4 | 61.67 | 65.87 | 83.55 | 62.32 | 59.98 | 78.06 | 20.24 | MistralForCausalLM |
| Mistral-7B-OpenOrca-oasst_top1_2023-08-25-v2 | Fine Tuned Models | 70 | 61.65 | 60.49 | 82.07 | 62.34 | 46.38 | 78.45 | 40.18 | MistralForCausalLM |
| rainbowfish-v6 | Chat Models | 72.4 | 61.64 | 61.95 | 82.51 | 62.79 | 48.37 | 77.9 | 36.32 | MistralForCausalLM |
| Qwen-14B-Chat-LLaMAfied | Fine Tuned Models | 141.7 | 61.6 | 57.51 | 82.11 | 65.57 | 51.99 | 72.93 | 39.5 | LlamaForCausalLM |
| Llamix2-MLewd-4x13B | Fine Tuned Models | 385 | 61.6 | 61.01 | 83.17 | 56.32 | 50.35 | 75.37 | 43.37 | MixtralForCausalLM |
| neural-chat-7b-v3-1 | Fine Tuned Models | 72.4 | 61.59 | 66.21 | 83.64 | 62.37 | 59.65 | 78.14 | 19.56 | MistralForCausalLM |
| neural-chat-7b-v3-1 | Fine Tuned Models | 72.4 | 61.59 | 65.7 | 83.54 | 62.12 | 59.48 | 78.61 | 20.09 | MistralForCausalLM |
| NSFW_DPO_Noromaid-7b | Fine Tuned Models | 72.4 | 61.59 | 62.63 | 84.5 | 63.34 | 44.99 | 78.22 | 35.86 | MistralForCausalLM |
| zephyr-7b-beta | Fine Tuned Models | 72.4 | 61.59 | 62.46 | 84.35 | 60.7 | 57.83 | 77.11 | 27.07 | MistralForCausalLM |
| DeciLM-7B | Pretrained Models | 70.4 | 61.55 | 59.39 | 82.51 | 59.76 | 40.33 | 79.95 | 47.38 | DeciLMForCausalLM |
| zephyr-7b-dpo-full-beta-0.2 | Chat Models | 72.4 | 61.55 | 61.77 | 84.04 | 61.79 | 54.72 | 76.95 | 30.02 | MistralForCausalLM |
| neural-chat-7b-v3-1 | Fine Tuned Models | 72.4 | 61.54 | 66.3 | 83.6 | 62.44 | 59.54 | 77.98 | 19.41 | MistralForCausalLM |
| OpenHermes-2.5-Mistral-7B | Fine Tuned Models | 72.4 | 61.52 | 64.93 | 84.18 | 63.64 | 52.24 | 78.06 | 26.08 | MistralForCausalLM |
| Mistral-7B-OpenOrca-lora-merged | Fine Tuned Models | 70 | 61.52 | 61.77 | 83.61 | 64.34 | 42.7 | 78.53 | 38.13 | Unknown |
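
A quick consistency check: the Average column in the table above appears to be the plain arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), rounded to two decimals. A minimal Python sketch, using the aanaphi2-v0.1 row as an example:

```python
# Assumption: Average = arithmetic mean of the six benchmark scores,
# rounded to two decimals. Checked against the aanaphi2-v0.1 row above.
scores = {
    "ARC": 63.91,
    "HellaSwag": 77.97,
    "MMLU": 57.73,
    "TruthfulQA": 51.56,
    "Winogrande": 73.64,
    "GSM8K": 54.89,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 63.28, matching the table's Average column
```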