The Open LLM Leaderboard tracks evaluation results for large language models, ranking LLMs and chatbots by their performance across a set of benchmark tasks.

Data source: HuggingFace. Figures are for reference only; defer to the official source for authoritative results. On the original page, the link next to each model name leads to its DataLearner model detail page.
| Model Name | Model Type | Params (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| Athena-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 53.16 | 56.66 | 80.56 | 55.43 | 53.62 | 72.61 | 0.08 | Unknown |
| Airboros2.1-Platypus2-13B-QLora-0.80-epoch | Chat Models | 130.2 | 53.15 | 58.96 | 82.46 | 54.62 | 47.71 | 75.14 | 0.0 | Unknown |
| Llama-2-13b-hf-ds_wiki_1024_full_r_64_alpha_16 | Chat Models | 130 | 53.14 | 59.04 | 82.33 | 55.36 | 35.75 | 76.32 | 10.01 | Unknown |
| LlongOrca-7B-16k | Chat Models | 70 | 53.02 | 57.51 | 79.44 | 49.35 | 49.84 | 74.51 | 7.51 | LlamaForCausalLM |
| Walter-Mistral-7B | Chat Models | 72.4 | 53.0 | 58.87 | 83.43 | 58.65 | 39.93 | 77.03 | 0.08 | MistralForCausalLM |
| Llama-2-7b-chat-hf-afr-100step-flan-v2 | Chat Models | 70 | 52.92 | 53.24 | 78.43 | 48.43 | 45.66 | 72.3 | 19.48 | LlamaForCausalLM |
| Llama-2-13b-ft-instruct-es | Chat Models | 130 | 52.89 | 59.39 | 81.51 | 54.31 | 37.81 | 75.77 | 8.57 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-afr-100step-flan | Chat Models | 70 | 52.88 | 52.9 | 78.44 | 48.4 | 45.67 | 72.38 | 19.48 | LlamaForCausalLM |
| archangel_sft-kto_llama13b | Chat Models | 130.2 | 52.87 | 56.14 | 80.8 | 47.84 | 39.42 | 76.16 | 16.83 | LlamaForCausalLM |
| Telugu-Llama2-7B-v0-Instruct | Chat Models | 70 | 52.86 | 53.58 | 78.33 | 47.63 | 43.26 | 73.95 | 20.39 | LlamaForCausalLM |
| japanese-stablelm-instruct-gamma-7b | Chat Models | 72.4 | 52.82 | 50.68 | 78.68 | 54.82 | 39.77 | 73.72 | 19.26 | MistralForCausalLM |
| ypotryll-22b-epoch2-qlora | Chat Models | 220 | 52.75 | 59.22 | 80.66 | 54.52 | 40.42 | 76.32 | 5.38 | Unknown |
| Llama-2-7b-chat-hf-afr-200step-flan-v2 | Chat Models | 70 | 52.75 | 52.65 | 78.04 | 48.51 | 45.42 | 72.93 | 18.95 | LlamaForCausalLM |
| yehoon_llama2 | Chat Models | 0 | 52.71 | 54.78 | 78.98 | 51.29 | 49.17 | 74.74 | 7.28 | Unknown |
| Mistral-Trismegistus-7B | Chat Models | 70 | 52.66 | 54.1 | 77.91 | 54.49 | 49.36 | 70.17 | 9.93 | MistralForCausalLM |
| Llama-2-7b-chat-hf-afr-200step-flan | Chat Models | 70 | 52.62 | 52.47 | 78.02 | 48.42 | 45.47 | 72.69 | 18.65 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-10-attention-sparsity | Chat Models | 67.4 | 52.52 | 52.9 | 78.18 | 48.1 | 45.4 | 71.43 | 19.11 | LlamaForCausalLM |
| Mistral-7B-golden | Chat Models | 70 | 52.49 | 60.75 | 44.42 | 59.29 | 53.51 | 76.64 | 20.32 | MistralForCausalLM |
| Llama-2-7b-chat-hf-10-sparsity | Chat Models | 67.4 | 52.48 | 53.16 | 78.26 | 48.18 | 45.29 | 71.59 | 18.42 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-afr-300step-flan-v2 | Chat Models | 70 | 52.41 | 52.56 | 77.76 | 48.51 | 45.14 | 72.53 | 17.97 | LlamaForCausalLM |
| PuddleJumper-Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130.2 | 52.41 | 54.52 | 79.36 | 55.15 | 54.32 | 71.11 | 0.0 | Unknown |
| TowerInstruct-7B-v0.1 | Chat Models | 67.4 | 52.39 | 55.46 | 79.0 | 46.88 | 42.59 | 73.95 | 16.45 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-afr-441step-flan-v2 | Chat Models | 70 | 52.28 | 52.13 | 77.63 | 48.52 | 45.02 | 72.53 | 17.82 | LlamaForCausalLM |
| Platypus2-13B-QLoRA-0.80-epoch | Chat Models | 130 | 52.27 | 57.76 | 81.63 | 55.63 | 39.7 | 75.93 | 2.96 | Unknown |
| Llama-2-7b-chat-hf-afr-200step-merged | Chat Models | 70 | 52.26 | 52.05 | 77.38 | 48.65 | 44.6 | 71.9 | 18.95 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-20-attention-sparsity | Chat Models | 67.4 | 52.19 | 53.41 | 77.91 | 47.49 | 45.84 | 70.72 | 17.74 | LlamaForCausalLM |
| mc_data_30k_from_platpus_orca_7b_10k_v1_lora_qkvo_rank14_v2 | Chat Models | 70 | 52.13 | 57.17 | 79.57 | 50.24 | 52.51 | 72.93 | 0.38 | Unknown |
| Llama-2-7b-chat-hf-20-sparsity | Chat Models | 70 | 52.01 | 52.47 | 77.91 | 47.27 | 45.88 | 70.72 | 17.82 | LlamaForCausalLM |
| Yi-6b-200k-dpo | Chat Models | 60.6 | 51.93 | 43.09 | 74.53 | 64.0 | 45.51 | 73.09 | 11.37 | LlamaForCausalLM |
| Yi-7b-dpo | Chat Models | 60.6 | 51.93 | 43.09 | 74.53 | 64.0 | 45.51 | 73.09 | 11.37 | Unknown |
| Llama-2-7b-chat-hf-30-attention-sparsity | Chat Models | 67.4 | 51.8 | 53.41 | 76.87 | 47.04 | 45.02 | 71.03 | 17.44 | LlamaForCausalLM |
| blossom-v2-llama2-7b | Chat Models | 70 | 51.71 | 54.1 | 78.57 | 51.66 | 46.84 | 74.35 | 4.78 | LlamaForCausalLM |
| LosslessMegaCoder-llama2-7b-mini | Chat Models | 70 | 51.66 | 53.5 | 77.38 | 49.72 | 45.77 | 74.03 | 9.55 | LlamaForCausalLM |
| stable-vicuna-13B-HF | Chat Models | 130 | 51.64 | 53.33 | 78.5 | 50.29 | 48.38 | 75.22 | 4.09 | LlamaForCausalLM |
| tamil-llama-13b-instruct-v0.1 | Chat Models | 130 | 51.59 | 54.52 | 79.35 | 50.37 | 41.22 | 76.56 | 7.51 | LlamaForCausalLM |
| airoboros-c34b-2.1 | Chat Models | 340 | 51.52 | 54.69 | 76.45 | 55.08 | 46.15 | 68.43 | 8.34 | LlamaForCausalLM |
| OpenHermes-7B | Chat Models | 70 | 51.26 | 56.14 | 78.32 | 48.62 | 45.0 | 74.51 | 5.0 | LlamaForCausalLM |
| airoboros-l2-7b-2.2.1 | Chat Models | 70 | 51.22 | 55.03 | 80.06 | 47.64 | 44.65 | 73.8 | 6.14 | LlamaForCausalLM |
| vicuna-7b-v1.3-attention-sparsity-10 | Chat Models | 67.4 | 51.13 | 52.22 | 77.05 | 47.93 | 46.87 | 69.53 | 13.19 | LlamaForCausalLM |
| Samantha-1.11-7b | Chat Models | 66.1 | 51.07 | 55.03 | 79.12 | 40.51 | 50.37 | 74.19 | 7.2 | Unknown |
| Llama-2-7b-chat-hf-30-sparsity | Chat Models | 67.4 | 51.02 | 52.47 | 76.58 | 45.57 | 44.82 | 69.61 | 17.06 | LlamaForCausalLM |
| LLaMa-2-PeanutButter_v18_B-7B | Chat Models | 70 | 50.94 | 54.61 | 81.0 | 47.07 | 41.93 | 74.51 | 6.52 | Unknown |
| Llama-2-7b-chat-hf-afr-100step-v2 | Chat Models | 70 | 50.89 | 52.65 | 78.25 | 48.47 | 45.18 | 72.3 | 8.49 | LlamaForCausalLM |
| GEITje-7B-chat-v2 | Chat Models | 72.4 | 50.79 | 50.34 | 74.13 | 49.0 | 43.55 | 71.51 | 16.22 | MistralForCausalLM |
| LLaMa-2-PeanutButter_v10-7B | Chat Models | 70 | 50.75 | 55.29 | 81.69 | 46.97 | 43.78 | 70.88 | 5.91 | Unknown |
| starling-7B | Chat Models | 70 | 50.73 | 51.02 | 76.77 | 47.75 | 48.18 | 70.56 | 10.08 | LlamaForCausalLM |
| vicuna-7b-v1.3-attention-sparsity-20 | Chat Models | 67.4 | 50.63 | 52.3 | 77.05 | 47.39 | 46.62 | 69.22 | 11.22 | LlamaForCausalLM |
| llama-2-7b-guanaco-instruct-sharded | Chat Models | 67.4 | 50.58 | 53.75 | 78.69 | 46.65 | 43.93 | 72.61 | 7.81 | LlamaForCausalLM |
| WizardCoder-Python-34B-V1.0 | Chat Models | 340 | 50.46 | 52.13 | 74.78 | 49.15 | 48.85 | 68.35 | 9.48 | LlamaForCausalLM |
| vicuna-7b-v1.3-attention-sparsity-30 | Chat Models | 67.4 | 50.33 | 51.02 | 76.41 | 46.83 | 46.06 | 69.3 | 12.36 | LlamaForCausalLM |
| Asclepius-Llama2-13B | Chat Models | 130 | 50.25 | 55.89 | 79.66 | 52.38 | 40.76 | 72.69 | 0.15 | LlamaForCausalLM |
| Llama-2-7b-chat-hf-afr-200step-v2 | Chat Models | 70 | 50.21 | 51.79 | 77.41 | 48.55 | 43.69 | 71.9 | 7.88 | LlamaForCausalLM |
| llama-7b-SFT-qlora-eli5-wiki_DPO_ds_RM_top_2_1024_r_64_alpha_16 | Chat Models | 70 | 49.98 | 54.1 | 78.74 | 45.44 | 43.4 | 73.64 | 4.55 | Unknown |
| Platypus2-7B | Chat Models | 67.4 | 49.97 | 55.2 | 78.84 | 49.83 | 40.64 | 73.48 | 1.82 | LlamaForCausalLM |
| LLaMa-2-PeanutButter_v18_A-7B | Chat Models | 70 | 49.88 | 53.16 | 78.11 | 45.54 | 40.37 | 74.9 | 7.2 | Unknown |
| ELYZA-japanese-Llama-2-7b-instruct | Chat Models | 70 | 49.78 | 53.16 | 78.25 | 47.07 | 39.08 | 73.24 | 7.88 | LlamaForCausalLM |
| llama-2-7b-hf_open-platypus | Chat Models | 67.4 | 49.73 | 51.45 | 78.63 | 43.6 | 43.71 | 74.43 | 6.6 | LlamaForCausalLM |
| Llama-2-7B-32K-Instruct | Chat Models | 70 | 49.65 | 51.37 | 78.47 | 45.53 | 45.01 | 72.85 | 4.7 | LlamaForCausalLM |
| ALMA-13B-R | Chat Models | 130.2 | 49.32 | 55.55 | 79.45 | 49.52 | 36.09 | 75.3 | 0.0 | Unknown |
| odia_llama2_7B_base | Chat Models | 70 | 49.3 | 50.77 | 75.94 | 46.1 | 37.27 | 70.8 | 14.94 | LlamaForCausalLM |
| llama-v2-7b-32kC-Security | Chat Models | 66.1 | 49.19 | 49.83 | 77.33 | 44.41 | 47.96 | 71.74 | 3.87 | Unknown |
| tora-code-34b-v1.0 | Chat Models | 340 | 48.95 | 50.43 | 75.54 | 46.78 | 39.66 | 68.19 | 13.12 | LlamaForCausalLM |
| youri-7b-chat | Chat Models | 67.4 | 48.51 | 51.19 | 76.09 | 46.06 | 41.17 | 75.06 | 1.52 | LlamaForCausalLM |
| tora-7b-v1.0 | Chat Models | 70 | 48.5 | 52.47 | 78.68 | 45.9 | 37.9 | 73.56 | 2.5 | LlamaForCausalLM |
| mhm-7b-v1.3-DPO-1 | Chat Models | 72.4 | 47.77 | 49.57 | 68.1 | 45.76 | 45.88 | 62.04 | 15.24 | MistralForCausalLM |
| openthaigpt-1.0.0-alpha-7b-chat-ckpt-hf | Chat Models | 70 | 47.65 | 50.85 | 74.89 | 40.02 | 47.23 | 69.06 | 3.87 | LlamaForCausalLM |
| baize-healthcare-lora-7B | Chat Models | 70 | 47.62 | 54.1 | 77.32 | 37.09 | 39.96 | 72.85 | 4.4 | Unknown |
| mixtral-ko-qna-merged | Chat Models | 467 | 47.24 | 39.51 | 39.06 | 71.86 | 48.61 | 56.75 | 27.67 | Unknown |
| Asclepius-Llama2-7B | Chat Models | 70 | 47.15 | 50.85 | 76.53 | 43.61 | 43.31 | 68.27 | 0.3 | LlamaForCausalLM |
| dolphin-2.2-yi-34b-200k | Chat Models | 340 | 46.67 | 42.15 | 68.18 | 55.47 | 45.93 | 64.56 | 3.71 | Unknown |
| quan-1.8b-chat | Chat Models | 18 | 45.91 | 39.08 | 62.37 | 44.09 | 43.15 | 59.27 | 27.52 | LlamaForCausalLM |
| CodeLlama-13b-Instruct-hf | Chat Models | 130.2 | 45.82 | 44.54 | 64.93 | 38.89 | 45.88 | 68.03 | 12.66 | LlamaForCausalLM |
| CodeLlama-13B-Instruct-fp16 | Chat Models | 130.2 | 45.82 | 44.62 | 64.94 | 38.77 | 45.88 | 68.03 | 12.66 | LlamaForCausalLM |
| speechless-codellama-platypus-13b | Chat Models | 130 | 45.64 | 45.31 | 68.63 | 42.82 | 42.38 | 65.59 | 9.1 | LlamaForCausalLM |
| codellama-13b-oasst-sft-v10 | Chat Models | 130.2 | 44.85 | 45.39 | 62.36 | 35.36 | 45.02 | 67.8 | 13.19 | LlamaForCausalLM |
| speechless-codellama-orca-13b | Chat Models | 130 | 44.83 | 44.37 | 65.2 | 43.46 | 45.94 | 64.01 | 5.99 | LlamaForCausalLM |
| palmyra-med-20b | Chat Models | 200 | 44.71 | 46.93 | 73.51 | 44.34 | 35.47 | 65.35 | 2.65 | GPT2LMHeadModel |
| Poro-34B-GPTQ | Chat Models | 480.6 | 44.67 | 47.01 | 73.75 | 32.47 | 38.37 | 71.35 | 5.08 | BloomForCausalLM |
| h2o-danube-1.8b-chat | Chat Models | 18.3 | 44.49 | 41.13 | 68.06 | 33.41 | 41.64 | 65.35 | 17.36 | MistralForCausalLM |
| falcon_7b_norobots | Chat Models | 70 | 44.46 | 47.87 | 77.92 | 27.94 | 36.81 | 71.74 | 4.47 | Unknown |
| CodeLlama-34b-Instruct-hf | Chat Models | 337.4 | 44.33 | 40.78 | 35.66 | 39.72 | 44.29 | 74.51 | 31.01 | LlamaForCausalLM |
| calm2-7b-chat-dpo-experimental | Chat Models | 70.1 | 44.03 | 41.04 | 68.99 | 39.82 | 43.13 | 65.67 | 5.53 | LlamaForCausalLM |
| gpt-sw3-20b-instruct | Chat Models | 209.2 | 43.7 | 43.17 | 71.09 | 31.32 | 41.02 | 66.77 | 8.79 | GPT2LMHeadModel |
| falcon_7b_3epoch_norobots | Chat Models | 70 | 43.65 | 47.61 | 77.24 | 29.73 | 36.27 | 69.53 | 1.52 | Unknown |
| deepseek-coder-6.7b-instruct | Chat Models | 67.4 | 43.57 | 38.14 | 55.09 | 39.02 | 45.56 | 56.83 | 26.76 | LlamaForCausalLM |
| calm2-7b-chat | Chat Models | 70 | 43.27 | 40.27 | 68.12 | 39.39 | 41.96 | 64.96 | 4.93 | LlamaForCausalLM |
| falcon-7b-instruct | Chat Models | 70 | 43.26 | 46.16 | 70.85 | 25.84 | 44.08 | 67.96 | 4.7 | FalconForCausalLM |
| falcon-7b-instruct | Chat Models | 70 | 43.16 | 45.82 | 70.78 | 25.66 | 44.07 | 68.03 | 4.62 | FalconForCausalLM |
| ex-llm-e1 | Chat Models | 0 | 43.11 | 39.93 | 68.11 | 39.44 | 42.01 | 64.88 | 4.32 | Unknown |
| CodeLlama-34B-Instruct-fp16 | Chat Models | 337.4 | 43.0 | 40.78 | 35.66 | 39.72 | 44.29 | 74.51 | 23.05 | LlamaForCausalLM |
| Deita-1_8B | Chat Models | 80 | 42.96 | 36.52 | 60.63 | 45.62 | 40.02 | 59.35 | 15.62 | LlamaForCausalLM |
| Qwen-1_8B-Chat-llama | Chat Models | 18.4 | 42.94 | 36.95 | 54.34 | 44.55 | 43.7 | 58.88 | 19.26 | LlamaForCausalLM |
| InstructPalmyra-20b | Chat Models | 200 | 42.91 | 47.1 | 73.0 | 28.26 | 41.81 | 64.72 | 2.58 | GPT2LMHeadModel |
| Qwen-1_8b-EverythingLM | Chat Models | 18.4 | 42.77 | 38.65 | 62.66 | 44.94 | 38.7 | 58.96 | 12.74 | LlamaForCausalLM |
| tora-code-13b-v1.0 | Chat Models | 130 | 42.7 | 44.45 | 69.29 | 36.67 | 34.98 | 62.59 | 8.19 | LlamaForCausalLM |
| Galpaca-30b-MiniOrca | Chat Models | 299.7 | 42.23 | 48.89 | 57.8 | 43.72 | 41.1 | 60.06 | 1.82 | OPTForCausalLM |
| open-llama-3b-v2-instruct | Chat Models | 34.3 | 42.02 | 38.48 | 70.24 | 39.69 | 37.96 | 65.75 | 0.0 | LlamaForCausalLM |
| gpt-sw3-6.7b-v2-instruct | Chat Models | 71.1 | 41.72 | 40.78 | 67.77 | 31.57 | 40.32 | 63.54 | 6.37 | GPT2LMHeadModel |
| shearedplats-2.7b-v2 | Chat Models | 27 | 41.61 | 42.41 | 72.58 | 27.52 | 39.76 | 65.9 | 1.52 | LlamaForCausalLM |
| MiniMerlin-3b-v0.1 | Chat Models | 30.2 | 41.6 | 40.7 | 54.06 | 43.32 | 49.65 | 60.54 | 1.36 | LlamaForCausalLM |
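The Average column above is consistent with the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), rounded to two decimals. A minimal Python sketch, checked against the Athena-Platypus2-13B-QLora-0.80-epoch row from the table:

```python
# Recompute the leaderboard "Average" as the mean of the six benchmark scores.
# Scores are taken verbatim from the Athena-Platypus2-13B-QLora-0.80-epoch row.
scores = {
    "ARC": 56.66,
    "HellaSwag": 80.56,
    "MMLU": 55.43,
    "TruthfulQA": 53.62,
    "Winogrande": 72.61,
    "GSM8K": 0.08,
}

average = round(sum(scores.values()) / len(scores), 2)
print(average)  # 53.16, matching the row's listed average
```

For a few rows the listed average differs from this recomputation by about 0.01, presumably due to rounding of the per-benchmark scores before display, so treat the table values as the authoritative ones.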