The Open LLM Leaderboard tracks evaluation results for large language models, ranking and assessing LLMs and chatbots by their performance across a set of benchmark tasks.
Data source: HuggingFace. The figures are for reference only; refer to the official leaderboard for authoritative numbers. Links next to the model names point to the corresponding DataLearner model detail pages.
| Model Name | Model Type | Parameters (×100M) | Average Score | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Model Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| shearedplats-2.7b-v2-instruct-v0.1 | Chat Models | 27 | 41.13 | 40.19 | 70.08 | 28.12 | 41.23 | 65.04 | 2.12 | LlamaForCausalLM |
| open-llama-3b-v2-chat | Chat Models | 34.3 | 40.93 | 40.61 | 70.3 | 28.73 | 37.84 | 65.51 | 2.58 | LlamaForCausalLM |
| smartyplats-3b-v2 | Chat Models | 30 | 40.29 | 41.04 | 71.19 | 24.32 | 36.66 | 66.93 | 1.59 | LlamaForCausalLM |
| openllama_3b_EvolInstruct_lora_merged | Chat Models | 30 | 40.28 | 40.27 | 71.6 | 27.12 | 34.78 | 67.01 | 0.91 | LlamaForCausalLM |
| CodeLlama-34B-Python-fp16 | Chat Models | 337.4 | 40.27 | 38.14 | 34.8 | 32.95 | 43.57 | 72.14 | 20.02 | LlamaForCausalLM |
| CodeLlama-34b-Python-hf | Chat Models | 337.4 | 40.27 | 40.19 | 36.82 | 34.79 | 44.28 | 71.19 | 14.33 | LlamaForCausalLM |
| tora-code-7b-v1.0 | Chat Models | 70 | 40.21 | 40.7 | 65.86 | 33.34 | 34.84 | 61.56 | 4.93 | LlamaForCausalLM |
| CodeLlama-7b-Instruct-hf | Chat Models | 67.4 | 40.05 | 36.52 | 55.44 | 34.54 | 41.25 | 64.56 | 7.96 | LlamaForCausalLM |
| open_llama_3b_code_instruct_0.1 | Chat Models | 34.3 | 39.72 | 41.21 | 66.96 | 27.82 | 35.01 | 65.43 | 1.9 | LlamaForCausalLM |
| weblab-10b-instruction-sft | Chat Models | 100 | 39.13 | 40.1 | 65.3 | 26.66 | 36.79 | 64.09 | 1.82 | GPTNeoXForCausalLM |
| deacon-3b | Chat Models | 34.3 | 39.05 | 39.68 | 66.42 | 27.13 | 36.07 | 64.64 | 0.38 | LlamaForCausalLM |
| black_goo_recipe_c | Chat Models | 0 | 39.01 | 38.74 | 66.83 | 26.57 | 36.54 | 64.72 | 0.68 | LlamaForCausalLM |
| cross_lingual_epoch2 | Chat Models | 0 | 38.97 | 39.25 | 47.92 | 36.66 | 47.9 | 62.12 | 0.0 | LlamaForCausalLM |
| open_llama_3b_instruct_v_0.2 | Chat Models | 34.3 | 38.97 | 38.48 | 66.77 | 25.34 | 38.16 | 63.46 | 1.59 | LlamaForCausalLM |
| WizardVicuna-open-llama-3b-v2 | Chat Models | 34.3 | 38.77 | 37.71 | 66.6 | 27.23 | 36.8 | 63.3 | 0.99 | LlamaForCausalLM |
| black_goo_recipe_a | Chat Models | 0 | 38.73 | 38.14 | 66.56 | 25.75 | 37.46 | 63.93 | 0.53 | LlamaForCausalLM |
| black_goo_recipe_d | Chat Models | 0 | 38.57 | 37.8 | 66.5 | 26.64 | 36.46 | 63.61 | 0.38 | LlamaForCausalLM |
| LLongMA-3b-LIMA | Chat Models | 30 | 38.51 | 39.08 | 67.15 | 26.43 | 34.71 | 63.38 | 0.3 | LlamaForCausalLM |
| black_goo_recipe_b | Chat Models | 0 | 38.49 | 37.63 | 66.72 | 25.68 | 37.09 | 63.77 | 0.08 | LlamaForCausalLM |
| falcon-rw-1b-instruct-openorca | Chat Models | 13.1 | 37.63 | 34.56 | 60.93 | 28.77 | 37.42 | 60.69 | 3.41 | FalconForCausalLM |
| Evaloric-1.1B | Chat Models | 11 | 37.54 | 35.07 | 60.93 | 25.36 | 37.78 | 64.96 | 1.14 | LlamaForCausalLM |
| falcon-rw-1b-chat | Chat Models | 13.1 | 37.37 | 35.58 | 61.12 | 24.51 | 39.62 | 61.72 | 1.67 | FalconForCausalLM |
| TinyLlama-1.1B-orca-v1.0 | Chat Models | 11 | 37.17 | 36.35 | 61.23 | 25.18 | 36.58 | 61.4 | 2.27 | LlamaForCausalLM |
| TinyLlama-1.1B-Chat-v1.0-intel-dpo | Chat Models | 11 | 37.09 | 35.84 | 61.29 | 25.05 | 37.38 | 61.01 | 1.97 | LlamaForCausalLM |
| TinyNaughtyLlama-v1.0 | Chat Models | 11 | 37.03 | 35.92 | 61.04 | 25.82 | 36.77 | 60.22 | 2.43 | LlamaForCausalLM |
| CodeLlama-13b-Python-hf | Chat Models | 130.2 | 37.0 | 32.59 | 43.94 | 27.23 | 44.59 | 65.04 | 8.64 | LlamaForCausalLM |
| Phind-CodeLlama-34B-v2 | Chat Models | 340 | 36.89 | 24.57 | 27.6 | 25.76 | 48.37 | 71.82 | 23.2 | LlamaForCausalLM |
| CodeLlama-7b-Python-hf | Chat Models | 67.4 | 36.89 | 31.31 | 52.86 | 27.32 | 42.21 | 63.06 | 4.55 | LlamaForCausalLM |
| dopeyshearedplats-1.3b-v1 | Chat Models | 13 | 36.74 | 34.39 | 64.31 | 25.4 | 38.21 | 57.38 | 0.76 | LlamaForCausalLM |
| 42dot_LLM-SFT-1.3B | Chat Models | 14.4 | 36.61 | 36.09 | 58.96 | 25.51 | 39.98 | 58.41 | 0.68 | LlamaForCausalLM |
| Deer-3b | Chat Models | 30 | 36.55 | 38.48 | 57.41 | 25.64 | 39.98 | 57.46 | 0.3 | BloomForCausalLM |
| CodeLlama-7b-Python-hf | Chat Models | 67.4 | 36.42 | 29.27 | 50.12 | 28.37 | 41.61 | 64.01 | 5.16 | LlamaForCausalLM |
| TinyDolphin-2.8-1.1b | Chat Models | 11 | 36.34 | 34.3 | 59.44 | 25.59 | 36.51 | 60.69 | 1.52 | LlamaForCausalLM |
| TinyDolphin-2.8.1-1.1b | Chat Models | 11 | 36.21 | 34.98 | 60.11 | 25.31 | 35.51 | 60.69 | 0.68 | LlamaForCausalLM |
| Deacon-1_8b | Chat Models | 18.4 | 36.03 | 33.7 | 52.33 | 33.97 | 39.05 | 57.14 | 0.0 | LlamaForCausalLM |
| TinyOpenHermes-1.1B-4k | Chat Models | 11 | 35.98 | 33.62 | 58.53 | 26.45 | 37.33 | 59.91 | 0.08 | LlamaForCausalLM |
| shearedplats-1.3b-v1 | Chat Models | 13 | 35.97 | 35.41 | 62.75 | 24.75 | 33.93 | 58.48 | 0.53 | LlamaForCausalLM |
| TinyDolphin-2.8.2-1.1b-laser | Chat Models | 11 | 35.93 | 33.36 | 58.53 | 25.93 | 36.33 | 60.14 | 1.29 | LlamaForCausalLM |
| opt-flan-iml-6.7b | Chat Models | 66.6 | 35.84 | 30.12 | 58.82 | 25.12 | 36.74 | 64.25 | 0.0 | OPTForCausalLM |
| Walter-Llama-1B | Chat Models | 11 | 35.29 | 32.85 | 61.05 | 27.46 | 33.93 | 56.43 | 0.0 | LlamaForCausalLM |
| dopeyplats-1.1b-2T-v1 | Chat Models | 11 | 35.28 | 33.11 | 54.31 | 24.55 | 39.26 | 58.8 | 1.67 | LlamaForCausalLM |
| platypus-1_8b | Chat Models | 18.4 | 35.24 | 33.28 | 50.76 | 33.25 | 40.73 | 52.96 | 0.45 | LlamaForCausalLM |
| Deacon-1b | Chat Models | 11 | 35.21 | 32.42 | 58.62 | 24.89 | 35.05 | 59.59 | 0.68 | LlamaForCausalLM |
| TinyWand-DPO | Chat Models | 16.3 | 35.13 | 31.66 | 50.42 | 26.22 | 45.8 | 54.78 | 1.9 | LlamaForCausalLM |
| falcon-1b-t-sft | Chat Models | 13.1 | 35.02 | 32.94 | 57.24 | 25.26 | 38.49 | 55.88 | 0.3 | FalconForCausalLM |
| Tiny-Vicuna-1B | Chat Models | 11 | 34.76 | 33.45 | 55.92 | 25.45 | 33.82 | 58.41 | 1.52 | LlamaForCausalLM |
| megachat | Chat Models | 0 | 34.75 | 30.8 | 54.35 | 25.55 | 39.85 | 56.99 | 0.99 | LlamaForCausalLM |
| TinyWand-SFT | Chat Models | 16.3 | 34.61 | 31.4 | 49.96 | 25.98 | 43.08 | 55.17 | 2.05 | LlamaForCausalLM |
| gpt-sw3-1.3b-instruct | Chat Models | 14.4 | 34.54 | 30.97 | 51.42 | 26.17 | 40.31 | 56.75 | 1.59 | GPT2LMHeadModel |
| tinyllama-1.1b-chat-v0.3_platypus | Chat Models | 11 | 34.5 | 30.29 | 55.12 | 26.13 | 39.15 | 55.8 | 0.53 | LlamaForCausalLM |
| gpt2-xl_lima | Chat Models | 15.6 | 34.12 | 31.14 | 51.28 | 25.43 | 38.74 | 57.22 | 0.91 | GPT2LMHeadModel |
| Walter-Falcon-1B | Chat Models | 13.1 | 34.07 | 31.06 | 54.92 | 24.58 | 38.47 | 55.41 | 0.0 | FalconForCausalLM |
| gpt-2-xl_camel-ai-physics | Chat Models | 15.6 | 33.96 | 29.52 | 50.62 | 26.79 | 39.12 | 57.54 | 0.15 | GPT2LMHeadModel |
| bilingual-gpt-neox-4b-instruction-sft | Chat Models | 38 | 32.46 | 28.07 | 47.5 | 23.12 | 43.76 | 52.33 | 0.0 | GPTNeoXForCausalLM |
| deepseek-coder-1.3b-instruct | Chat Models | 13 | 32.4 | 28.58 | 39.87 | 28.47 | 44.02 | 52.41 | 1.06 | LlamaForCausalLM |
| gpt2-large-conversational | Chat Models | 7.7 | 32.33 | 26.96 | 44.98 | 26.33 | 39.6 | 56.04 | 0.08 | GPT2LMHeadModel |
| FLOR-1.3B-xat | Chat Models | 13.1 | 32.27 | 26.79 | 41.63 | 26.65 | 44.38 | 53.43 | 0.76 | BloomForCausalLM |
| llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0 | Chat Models | 130 | 31.77 | 26.88 | 44.78 | 23.12 | 45.19 | 50.67 | 0.0 | GPT2LMHeadModel |
| llm-jp-13b-instruct-full-jaster-v1.0 | Chat Models | 130 | 31.63 | 27.22 | 44.7 | 23.12 | 44.69 | 50.04 | 0.0 | GPT2LMHeadModel |
| Aira-2-774M | Chat Models | 7.7 | 31.33 | 28.75 | 40.8 | 25.1 | 41.33 | 52.01 | 0.0 | GPT2LMHeadModel |
| Aira-2-355M | Chat Models | 3.6 | 31.0 | 27.56 | 38.92 | 27.26 | 38.53 | 53.75 | 0.0 | GPT2LMHeadModel |
| GPTNeo350M-Instruct-SFT | Chat Models | 4.6 | 31.0 | 25.94 | 38.55 | 25.76 | 45.25 | 50.2 | 0.3 | GPTNeoForCausalLM |
| gpt-sw3-356m-instruct | Chat Models | 4.7 | 30.93 | 26.96 | 38.01 | 25.53 | 40.74 | 52.57 | 1.74 | GPT2LMHeadModel |
| speechless-codellama-orca-airoboros-13b-0.10e | Chat Models | 130.2 | 30.36 | 29.44 | 25.71 | 25.43 | 49.64 | 51.93 | 0.0 | LlamaForCausalLM |
| flyingllama-v2 | Chat Models | 4.6 | 30.19 | 24.74 | 38.44 | 26.37 | 41.3 | 50.28 | 0.0 | LlamaForCausalLM |
| bloom-1b1-RLHF | Chat Models | 0.2 | 30.14 | 27.99 | 26.19 | 26.86 | 48.88 | 50.91 | 0.0 | Unknown |
| cutie | Chat Models | 72.4 | 29.87 | 26.96 | 27.02 | 24.17 | 48.42 | 52.64 | 0.0 | Unknown |
| speechless-codellama-orca-platypus-13b-0.10e | Chat Models | 130.2 | 29.83 | 28.75 | 25.88 | 25.36 | 49.27 | 49.72 | 0.0 | LlamaForCausalLM |
| Llama-68M-Chat-v1 | Chat Models | 0.7 | 29.72 | 23.29 | 28.27 | 25.18 | 47.27 | 54.3 | 0.0 | LlamaForCausalLM |
| mistral-7b-dpo-open-orca-flan-50k-synthetic-5-models | Chat Models | 72.4 | 29.48 | 25.51 | 25.52 | 26.82 | 48.81 | 50.2 | 0.0 | MistralForCausalLM |
| zephyr-smol_llama-100m-dpo-full | Chat Models | 1 | 29.37 | 25.0 | 28.54 | 25.18 | 45.75 | 51.07 | 0.68 | LlamaForCausalLM |
| smol_llama-220M-openhermes | Chat Models | 2.2 | 29.34 | 25.17 | 28.98 | 26.17 | 43.08 | 52.01 | 0.61 | LlamaForCausalLM |
| zephyr-220m-dpo-full | Chat Models | 2.2 | 29.33 | 25.43 | 29.15 | 26.43 | 43.44 | 50.99 | 0.53 | MistralForCausalLM |
| zephyr-220m-sft-full | Chat Models | 2.2 | 29.33 | 25.26 | 29.03 | 26.45 | 43.23 | 51.62 | 0.38 | MistralForCausalLM |
| Aira-2-1B1 | Chat Models | 11 | 29.32 | 23.21 | 26.97 | 24.86 | 50.63 | 50.28 | 0.0 | LlamaForCausalLM |
| llama2-13b-platypus-ckpt-1000 | Chat Models | 128.5 | 29.28 | 28.16 | 26.55 | 23.17 | 48.79 | 49.01 | 0.0 | Unknown |
| changpt-bart | Chat Models | 1.8 | 29.27 | 28.67 | 26.41 | 23.12 | 47.94 | 49.49 | 0.0 | Unknown |
| gpt2-dolly | Chat Models | 1.2 | 29.21 | 22.7 | 30.15 | 25.81 | 44.97 | 51.46 | 0.15 | GPT2LMHeadModel |
| smol_llama-220M-open_instruct | Chat Models | 2.2 | 29.19 | 25.0 | 29.71 | 26.11 | 44.06 | 50.28 | 0.0 | LlamaForCausalLM |
| gpt2_open-platypus | Chat Models | 1.2 | 28.58 | 22.18 | 31.29 | 26.19 | 40.35 | 51.3 | 0.15 | GPT2LMHeadModel |
| KoAlpaca-KoRWKV-6B | Chat Models | 65.3 | 28.57 | 23.46 | 31.65 | 24.89 | 39.83 | 51.62 | 0.0 | RwkvForCausalLM |
| gpt2_guanaco-dolly-platypus | Chat Models | 1.2 | 28.52 | 23.55 | 31.03 | 26.4 | 40.02 | 50.12 | 0.0 | GPT2LMHeadModel |
| gpt2_platypus-dolly-guanaco | Chat Models | 1.2 | 28.51 | 23.21 | 31.04 | 26.16 | 40.31 | 50.36 | 0.0 | GPT2LMHeadModel |
| Mixsmol-4x400M-v0.1-epoch1 | Chat Models | 17.7 | 28.45 | 22.87 | 30.57 | 25.28 | 39.03 | 52.8 | 0.15 | MixtralForCausalLM |
| gpt2_camel_physics-platypus | Chat Models | 1.2 | 28.41 | 23.04 | 31.32 | 26.91 | 39.56 | 49.64 | 0.0 | GPT2LMHeadModel |
| gpt2_platypus-camel_physics | Chat Models | 1.2 | 28.41 | 23.04 | 31.32 | 26.91 | 39.56 | 49.64 | 0.0 | Unknown |
| gpt2_platypus-camel_physics | Chat Models | 1.2 | 28.4 | 22.78 | 31.24 | 25.87 | 38.95 | 51.54 | 0.0 | Unknown |
| GPT-2-SlimOrcaDeduped-airoboros-3.1-MetaMathQA-SFT-124M | Chat Models | 1.2 | 28.3 | 24.57 | 29.43 | 25.82 | 38.84 | 49.01 | 2.12 | Unknown |
| gpt-sw3-126m-instruct | Chat Models | 1.9 | 28.2 | 23.38 | 29.88 | 23.78 | 42.65 | 48.54 | 0.99 | GPT2LMHeadModel |
| TinyMistral-248M-SFT-v4 | Chat Models | 2.5 | 28.2 | 24.91 | 28.15 | 26.04 | 39.56 | 50.51 | 0.0 | MistralForCausalLM |
| TinyMistral-248M-Instruct | Chat Models | 2.5 | 28.19 | 24.32 | 27.52 | 25.18 | 41.94 | 50.2 | 0.0 | MistralForCausalLM |
| gpt2-dolly | Chat Models | 1.2 | 28.18 | 21.76 | 30.77 | 24.66 | 42.22 | 49.57 | 0.08 | GPT2LMHeadModel |
| TinyMistral-6x248M-Instruct | Chat Models | 10 | 27.89 | 22.44 | 27.02 | 24.13 | 43.16 | 50.59 | 0.0 | MixtralForCausalLM |
| TinyMistral-248M-SFT-v3 | Chat Models | 2.5 | 27.45 | 21.93 | 28.26 | 22.91 | 40.03 | 51.54 | 0.0 | Unknown |
| v1olet_mistral_7B | Chat Models | 70 | 22.16 | 29.18 | 28.13 | 26.24 | 0.0 | 49.41 | 0.0 | Unknown |
| Sakura-SOLAR-Instruct-DPO-v1 | Chat Models | 107.3 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | Unknown |
| Pythia-31M-Chat-v1 | Chat Models | 0.3 | 19.92 | 22.7 | 25.6 | 23.24 | 0.0 | 47.99 | 0.0 | GPTNeoXForCausalLM |
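
The Average Score column is the arithmetic mean of the six benchmark columns (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K). Below is a minimal sketch that reproduces this calculation for the first data row; the function name and the row dictionary are illustrative, not part of the leaderboard's own code.

```python
# Minimal sketch: the leaderboard "Average Score" is the plain mean of the
# six benchmark scores shown in the table (rounded to two decimals).

def leaderboard_average(scores: dict) -> float:
    """Mean of the ARC, HellaSwag, MMLU, TruthfulQA, Winogrande and GSM8K scores."""
    benchmarks = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]
    return round(sum(scores[b] for b in benchmarks) / len(benchmarks), 2)

# Example: shearedplats-2.7b-v2-instruct-v0.1 (first data row of the table).
row = {"ARC": 40.19, "HellaSwag": 70.08, "MMLU": 28.12,
       "TruthfulQA": 41.23, "Winogrande": 65.04, "GSM8K": 2.12}
print(leaderboard_average(row))  # 41.13, matching the Average Score column
```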