Open LLM Leaderboard is a leaderboard that tracks evaluation results for large language models, ranking LLMs and chatbots by their performance across a set of benchmark tasks.

Data source: HuggingFace. The data is for reference only; refer to the official source for authoritative results. The link next to each model name leads to its DataLearner model detail page.

| Model Name | Model Type | Parameters (100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| h2ogpt-gm-oasst1-en-1024-20b | Fine Tuned Models | 200 | 42.58 | 48.04 | 72.76 | 25.96 | 39.92 | 66.3 | 2.5 | GPTNeoXForCausalLM |
| llama-2-13b-chat-hf-phr_mental_therapy | Fine Tuned Models | 130 | 42.5 | 38.82 | 72.76 | 23.12 | 46.92 | 65.59 | 7.81 | LlamaForCausalLM |
| h2ogpt-oasst1-512-20b | Fine Tuned Models | 200 | 42.44 | 46.93 | 72.77 | 26.25 | 37.5 | 68.03 | 3.18 | GPTNeoXForCausalLM |
| codellama_7b_DolphinCoder | Fine Tuned Models | 70 | 42.39 | 41.98 | 65.5 | 38.11 | 35.45 | 63.61 | 9.7 | Unknown |
| LL7M | Fine Tuned Models | 0.1 | 42.38 | 44.97 | 68.81 | 34.44 | 41.39 | 64.09 | 0.61 | LlamaForCausalLM |
| RedPajama-INCITE-7B-Instruct | Fine Tuned Models | 70 | 42.38 | 44.11 | 72.02 | 37.62 | 33.96 | 64.96 | 1.59 | GPTNeoXForCausalLM |
| RedPajama-INCITE-Instruct-7B-v0.1 | Fine Tuned Models | 66.5 | 42.38 | 44.11 | 72.02 | 37.62 | 33.96 | 64.96 | 1.59 | Unknown |
| open_llama_7b | Pretrained Models | 70 | 42.31 | 47.01 | 71.98 | 30.49 | 34.85 | 67.96 | 1.59 | LlamaForCausalLM |
| bloomz-7b1-mt-sft-chat | Fine Tuned Models | 70.7 | 42.24 | 44.03 | 62.6 | 38.64 | 44.34 | 63.3 | 0.53 | BloomForCausalLM |
| Galpaca-30b-MiniOrca | Chat Models | 299.7 | 42.23 | 48.89 | 57.8 | 43.72 | 41.1 | 60.06 | 1.82 | OPTForCausalLM |
| pythia-12b-sft-v8-7k-steps | Fine Tuned Models | 120 | 42.21 | 44.03 | 70.28 | 26.55 | 36.53 | 65.27 | 10.61 | GPTNeoXForCausalLM |
| bloomz-7b1 | Unknown Model Types | 70 | 42.21 | 42.49 | 63.01 | 37.85 | 45.2 | 64.64 | 0.08 | BloomForCausalLM |
| bloomz-7b1-mt | Unknown Model Types | 70 | 42.14 | 43.86 | 62.91 | 37.35 | 45.65 | 63.06 | 0.0 | BloomForCausalLM |
| Sheared-LLaMA-2.7B-ShareGPT | Fine Tuned Models | 27 | 42.11 | 41.04 | 71.26 | 28.5 | 47.71 | 64.17 | 0.0 | LlamaForCausalLM |
| palmyra-large | Pretrained Models | 0 | 42.09 | 44.97 | 71.85 | 28.54 | 35.93 | 67.88 | 3.41 | GPT2LMHeadModel |
| pygmalion-6b-vicuna-chatml | Fine Tuned Models | 60 | 42.08 | 40.61 | 67.73 | 33.92 | 42.76 | 63.06 | 4.4 | GPTJForCausalLM |
| Marx-3B-V2 | Fine Tuned Models | 34.3 | 42.08 | 44.03 | 72.92 | 27.84 | 39.92 | 66.54 | 1.21 | LlamaForCausalLM |
| speechless-tora-code-7b-v1.0 | Fine Tuned Models | 70 | 42.04 | 42.66 | 65.16 | 38.56 | 42.06 | 62.9 | 0.91 | LlamaForCausalLM |
| open-llama-3b-v2-instruct | Chat Models | 34.3 | 42.02 | 38.48 | 70.24 | 39.69 | 37.96 | 65.75 | 0.0 | LlamaForCausalLM |
| opt-30b | Pretrained Models | 300 | 42.0 | 43.26 | 74.07 | 26.66 | 35.16 | 70.64 | 2.2 | OPTForCausalLM |
| datascience-coder-6.7b | Fine Tuned Models | 67 | 41.99 | 34.64 | 53.83 | 37.96 | 44.82 | 55.72 | 24.94 | LlamaForCausalLM |
| pythia-12b-sft-v8-2.5k-steps | Fine Tuned Models | 120 | 41.97 | 42.32 | 70.15 | 27.36 | 36.75 | 65.67 | 9.55 | GPTNeoXForCausalLM |
| h2ogpt-gm-oasst1-multilang-1024-20b | Fine Tuned Models | 200 | 41.9 | 47.44 | 72.58 | 26.37 | 34.39 | 68.43 | 2.2 | GPTNeoXForCausalLM |
| GPT-JT-Moderation-6B | Unknown Model Types | 60 | 41.8 | 40.53 | 67.66 | 41.63 | 37.33 | 62.67 | 0.99 | GPTJForCausalLM |
| Barcenas-3b | Fine Tuned Models | 30 | 41.74 | 43.17 | 67.82 | 29.16 | 41.56 | 66.22 | 2.5 | LlamaForCausalLM |
| gpt-sw3-6.7b-v2-instruct | Chat Models | 71.1 | 41.72 | 40.78 | 67.77 | 31.57 | 40.32 | 63.54 | 6.37 | GPT2LMHeadModel |
| Marx-3B | Fine Tuned Models | 34.3 | 41.71 | 43.17 | 72.68 | 28.46 | 39.09 | 65.59 | 1.29 | LlamaForCausalLM |
| gpt-neox-20b | Pretrained Models | 207.4 | 41.69 | 45.73 | 73.45 | 25.0 | 31.61 | 68.9 | 5.46 | GPTNeoXForCausalLM |
| pythia-12b-sft-v8-rlhf-2k-steps | Unknown Model Types | 120 | 41.65 | 43.43 | 70.08 | 26.12 | 36.06 | 64.64 | 9.55 | GPTNeoXForCausalLM |
| shearedplats-2.7b-v2 | Chat Models | 27 | 41.61 | 42.41 | 72.58 | 27.52 | 39.76 | 65.9 | 1.52 | LlamaForCausalLM |
| MiniMerlin-3b-v0.1 | Chat Models | 30.2 | 41.6 | 40.7 | 54.06 | 43.32 | 49.65 | 60.54 | 1.36 | LlamaForCausalLM |
| glaive-coder-7b | Fine Tuned Models | 70 | 41.56 | 42.66 | 64.69 | 37.15 | 39.88 | 59.75 | 5.23 | LlamaForCausalLM |
| RedPajama-INCITE-7B-Base | Pretrained Models | 70 | 41.49 | 46.25 | 71.63 | 27.68 | 33.03 | 67.32 | 3.03 | GPTNeoXForCausalLM |
| gpt4all-j | Unknown Model Types | 0 | 41.49 | 41.98 | 64.06 | 28.2 | 42.78 | 64.72 | 7.2 | GPTJForCausalLM |
| open-llama-3b-v2-wizard-evol-instuct-v2-196k | Fine Tuned Models | 34.3 | 41.46 | 41.81 | 73.01 | 26.36 | 38.99 | 66.69 | 1.9 | LlamaForCausalLM |
| MiniMA-3B | Pretrained Models | 30.2 | 41.44 | 43.43 | 68.06 | 28.69 | 39.76 | 65.98 | 2.73 | LlamaForCausalLM |
| open-llama-3b-everything-v2 | Fine Tuned Models | 34.3 | 41.41 | 42.83 | 73.28 | 26.87 | 37.26 | 66.61 | 1.59 | LlamaForCausalLM |
| mommygpt-3B | Fine Tuned Models | 34.3 | 41.36 | 41.89 | 71.69 | 28.74 | 37.9 | 65.82 | 2.12 | LlamaForCausalLM |
| orca_mini_13b | Unknown Model Types | 128.5 | 41.36 | 42.06 | 63.4 | 35.43 | 43.1 | 64.17 | 0.0 | Unknown |
| nucleus-22B-token-500B | Pretrained Models | 218.3 | 41.33 | 40.7 | 69.39 | 30.11 | 39.16 | 67.64 | 0.99 | LlamaForCausalLM |
| llama-2-34b-uncode | Fine Tuned Models | 337.4 | 41.33 | 39.51 | 33.9 | 38.49 | 40.94 | 74.35 | 20.77 | LlamaForCausalLM |
| oasst-sft-4-pythia-12b-epoch-3.5 | Fine Tuned Models | 120 | 41.31 | 45.73 | 68.59 | 26.82 | 37.81 | 65.9 | 3.03 | GPTNeoXForCausalLM |
| orca_mini_7b | Fine Tuned Models | 66.1 | 41.27 | 43.94 | 65.22 | 29.97 | 42.03 | 66.06 | 0.38 | Unknown |
| GPT-NeoX-20B-Erebus | Fine Tuned Models | 200 | 41.26 | 45.48 | 72.79 | 26.77 | 32.15 | 68.11 | 2.27 | GPTNeoXForCausalLM |
| RedPajama-INCITE-Base-7B-v0.1 | Pretrained Models | 66.5 | 41.25 | 46.25 | 71.63 | 27.68 | 33.03 | 67.32 | 1.59 | Unknown |
| mamba-gpt-3b-v4 | Fine Tuned Models | 34.3 | 41.24 | 42.58 | 71.04 | 30.04 | 37.26 | 65.82 | 0.68 | LlamaForCausalLM |
| open-llama-3b-v2-elmv3 | Fine Tuned Models | 34.3 | 41.14 | 42.06 | 73.28 | 27.61 | 35.54 | 64.96 | 3.41 | LlamaForCausalLM |
| Griffin-3B | Fine Tuned Models | 34.3 | 41.13 | 41.81 | 72.3 | 26.36 | 38.33 | 67.01 | 0.99 | LlamaForCausalLM |
| shearedplats-2.7b-v2-instruct-v0.1 | Chat Models | 27 | 41.13 | 40.19 | 70.08 | 28.12 | 41.23 | 65.04 | 2.12 | LlamaForCausalLM |
| open-llama-3b-v2-elmv3 | Fine Tuned Models | 34.3 | 41.13 | 42.15 | 73.26 | 27.16 | 35.51 | 64.96 | 3.71 | LlamaForCausalLM |
| speechless-coder-ds-6.7b | Fine Tuned Models | 67 | 41.11 | 36.86 | 52.46 | 38.08 | 41.67 | 58.88 | 18.73 | LlamaForCausalLM |
| open-llama-0.7T-7B-open-instruct-v1.1 | Fine Tuned Models | 70 | 41.11 | 46.67 | 67.67 | 28.55 | 37.6 | 65.43 | 0.76 | LlamaForCausalLM |
| mamba-gpt-3b-v3 | Fine Tuned Models | 34.3 | 41.11 | 41.72 | 71.05 | 27.31 | 37.86 | 67.48 | 1.21 | LlamaForCausalLM |
| pythia-12b-pre-v8-12.5k-steps | Fine Tuned Models | 120 | 41.1 | 41.47 | 68.8 | 26.58 | 36.82 | 65.27 | 7.66 | GPTNeoXForCausalLM |
| GPT-NeoX-20B-Skein | Fine Tuned Models | 200 | 41.1 | 44.97 | 72.68 | 25.99 | 31.64 | 68.43 | 2.88 | GPTNeoXForCausalLM |
| open-llama-3b-v2-wizard-evol-instuct-v2-196k | Fine Tuned Models | 34.3 | 41.09 | 41.21 | 72.88 | 25.39 | 38.87 | 66.61 | 1.59 | LlamaForCausalLM |
| FLAMA-0.1-3B | Fine Tuned Models | 30 | 41.07 | 41.72 | 71.41 | 26.59 | 37.19 | 66.54 | 2.96 | LlamaForCausalLM |
| OpenLlama-Platypus-3B | Fine Tuned Models | 34.3 | 41.05 | 41.21 | 71.67 | 29.86 | 36.45 | 65.98 | 1.14 | LlamaForCausalLM |
| Puma-3B | Fine Tuned Models | 34.3 | 41.02 | 41.3 | 71.85 | 27.51 | 38.34 | 66.38 | 0.76 | LlamaForCausalLM |
| wizard-orca-3b | Fine Tuned Models | 34.3 | 41.0 | 41.72 | 71.78 | 24.49 | 40.04 | 66.93 | 1.06 | LlamaForCausalLM |
| Amber | Pretrained Models | 0 | 40.97 | 40.96 | 73.79 | 26.84 | 33.56 | 67.88 | 2.81 | LlamaForCausalLM |
| open-llama-3b-claude-30k | Fine Tuned Models | 34.3 | 40.93 | 41.72 | 72.64 | 24.03 | 38.46 | 66.54 | 2.2 | LlamaForCausalLM |
| open-llama-3b-v2-chat | Chat Models | 34.3 | 40.93 | 40.61 | 70.3 | 28.73 | 37.84 | 65.51 | 2.58 | LlamaForCausalLM |
| deepseek-coder-6.7b-chat-and-function-calling | Fine Tuned Models | 67.4 | 40.91 | 36.09 | 53.8 | 38.29 | 42.83 | 57.22 | 17.21 | LlamaForCausalLM |
| deepseek-coder-6.7b-chat | Fine Tuned Models | 67.4 | 40.9 | 36.01 | 53.74 | 38.22 | 42.94 | 57.54 | 16.98 | LlamaForCausalLM |
| deepseek-coder-6.7b-chat | Fine Tuned Models | 67.4 | 40.9 | 35.75 | 53.7 | 38.19 | 42.94 | 58.01 | 16.83 | LlamaForCausalLM |
| Sheared-LLaMA-2.7B | Fine Tuned Models | 27 | 40.84 | 41.72 | 71.01 | 26.92 | 37.32 | 67.01 | 1.06 | LlamaForCausalLM |
| amber_fine_tune_ori | Fine Tuned Models | 67.4 | 40.83 | 44.45 | 75.1 | 26.04 | 34.94 | 63.14 | 1.29 | LlamaForCausalLM |
| GPT-R | Fine Tuned Models | 0 | 40.8 | 41.21 | 66.89 | 36.5 | 34.22 | 64.4 | 1.59 | GPTJForCausalLM |
| ShortKing-3b-v0.3 | Fine Tuned Models | 34.3 | 40.8 | 40.96 | 70.72 | 26.21 | 38.78 | 66.93 | 1.21 | Unknown |
| oasst-sft-1-pythia-12b | Fine Tuned Models | 120 | 40.77 | 46.42 | 70.0 | 26.19 | 39.19 | 62.19 | 0.61 | GPTNeoXForCausalLM |
| gogpt-7b-bloom | Fine Tuned Models | 70 | 40.75 | 44.62 | 62.56 | 33.81 | 40.61 | 62.9 | 0.0 | BloomForCausalLM |
| gpt-sw3-20b | Pretrained Models | 209.2 | 40.71 | 41.81 | 68.75 | 28.47 | 37.1 | 67.17 | 0.99 | GPT2LMHeadModel |
| chatml-pyg-v1 | Fine Tuned Models | 0 | 40.7 | 37.88 | 63.29 | 32.77 | 42.61 | 62.51 | 5.16 | GPTJForCausalLM |
| h2ogpt-gm-oasst1-en-1024-12b | Fine Tuned Models | 120 | 40.65 | 43.09 | 69.75 | 25.87 | 38.0 | 66.14 | 1.06 | GPTNeoXForCausalLM |
| open-llama-3b-everythingLM-2048 | Fine Tuned Models | 34.3 | 40.62 | 42.75 | 71.72 | 27.16 | 34.26 | 66.3 | 1.52 | LlamaForCausalLM |
| Javalion-R | Fine Tuned Models | 0 | 40.51 | 41.72 | 68.02 | 30.81 | 34.44 | 65.43 | 2.65 | GPTJForCausalLM |
| h2ogpt-oasst1-512-12b | Fine Tuned Models | 120 | 40.48 | 42.32 | 70.24 | 26.01 | 36.41 | 66.22 | 1.67 | GPTNeoXForCausalLM |
| ThetaWave-28B-v0.1 | Pretrained Models | 281.8 | 40.4 | 36.6 | 35.54 | 54.5 | 49.86 | 65.9 | 0.0 | MistralForCausalLM |
| Javelin-R | Fine Tuned Models | 0 | 40.39 | 41.64 | 69.01 | 30.7 | 34.5 | 64.8 | 1.67 | GPTJForCausalLM |
| WizardCoder-Python-7B-V1.0 | Fine Tuned Models | 70 | 40.32 | 41.81 | 65.06 | 32.29 | 36.32 | 61.72 | 4.7 | LlamaForCausalLM |
| smartyplats-3b-v2 | Chat Models | 30 | 40.29 | 41.04 | 71.19 | 24.32 | 36.66 | 66.93 | 1.59 | LlamaForCausalLM |
| open_llama_3b_v2 | Pretrained Models | 30 | 40.28 | 40.27 | 71.6 | 27.12 | 34.78 | 67.01 | 0.91 | LlamaForCausalLM |
| openllama_3b_EvolInstruct_lora_merged | Chat Models | 30 | 40.28 | 40.27 | 71.6 | 27.12 | 34.78 | 67.01 | 0.91 | LlamaForCausalLM |
| CodeLlama-34B-Python-fp16 | Chat Models | 337.4 | 40.27 | 38.14 | 34.8 | 32.95 | 43.57 | 72.14 | 20.02 | LlamaForCausalLM |
| CodeLlama-34b-Python-hf | Chat Models | 337.4 | 40.27 | 40.19 | 36.82 | 34.79 | 44.28 | 71.19 | 14.33 | LlamaForCausalLM |
| open-llama-3b-v2-layla | Fine Tuned Models | 30 | 40.25 | 38.23 | 66.43 | 28.56 | 44.4 | 62.83 | 1.06 | LlamaForCausalLM |
| Javelin-GPTJ | Fine Tuned Models | 0 | 40.23 | 42.66 | 70.45 | 26.2 | 36.08 | 64.17 | 1.82 | GPTJForCausalLM |
| tora-code-7b-v1.0 | Chat Models | 70 | 40.21 | 40.7 | 65.86 | 33.34 | 34.84 | 61.56 | 4.93 | LlamaForCausalLM |
| Janin-R | Fine Tuned Models | 0 | 40.19 | 40.44 | 67.36 | 31.24 | 34.49 | 65.35 | 2.27 | GPTJForCausalLM |
| Bean-3B | Fine Tuned Models | 34.3 | 40.18 | 40.36 | 72.0 | 26.43 | 36.11 | 65.67 | 0.53 | LlamaForCausalLM |
| gpt-j-6b | Pretrained Models | 60 | 40.1 | 41.38 | 67.54 | 26.78 | 35.96 | 65.98 | 2.96 | GPTJForCausalLM |
| calypso-3b-alpha-v2 | Fine Tuned Models | 30 | 40.09 | 41.55 | 71.48 | 25.82 | 35.73 | 65.27 | 0.68 | LlamaForCausalLM |
| CodeBarcenas-7b | Fine Tuned Models | 70 | 40.09 | 42.32 | 63.43 | 33.39 | 38.51 | 60.38 | 2.5 | LlamaForCausalLM |
| opt-13b | Pretrained Models | 130 | 40.06 | 39.93 | 71.2 | 24.9 | 34.1 | 68.51 | 1.74 | OPTForCausalLM |
| CodeLlama-7b-Instruct-hf | Chat Models | 67.4 | 40.05 | 36.52 | 55.44 | 34.54 | 41.25 | 64.56 | 7.96 | LlamaForCausalLM |
| DopeyTinyLlama-1.1B-v1 | Fine Tuned Models | 11 | 40.04 | 38.4 | 63.49 | 25.76 | 37.36 | 73.4 | 1.82 | LlamaForCausalLM |
| speechless-tools-7b | Fine Tuned Models | 70 | 40.0 | 38.91 | 57.69 | 33.24 | 44.08 | 58.56 | 7.51 | LlamaForCausalLM |
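The average column is simply the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), rounded to two decimals. A minimal sketch verifying this against the first row of the table:

```python
# Verify that the leaderboard "Average" equals the mean of the six
# benchmark scores, using the h2ogpt-gm-oasst1-en-1024-20b row above.
scores = [48.04, 72.76, 25.96, 39.92, 66.3, 2.5]  # ARC..GSM8K

average = round(sum(scores) / len(scores), 2)
print(average)  # 42.58, matching the Average column
```

The same check can be applied to any row, which is useful for spotting transcription errors when copying leaderboard data.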