The Open LLM Leaderboard tracks large language model evaluation results, ranking and assessing LLMs and chatbots by their performance across a set of benchmark tasks.

Data source: HuggingFace. Data is for reference only; official sources are authoritative. Click a model name to view its DataLearner model profile.
| Model | Type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| wizard-mega-13B-GPTQ | Unknown Model Types | 162.2 | 35.18 | 27.73 | 26.01 | 24.97 | 48.69 | 74.74 | 8.95 | LlamaForCausalLM |
| chronos-wizardlm-uc-scot-st-13B-GPTQ | Unknown Model Types | 162.2 | 35.15 | 27.99 | 26.1 | 25.72 | 49.68 | 74.51 | 6.9 | LlamaForCausalLM |
| TinyWand-DPO | Chat Models | 16.3 | 35.13 | 31.66 | 50.42 | 26.22 | 45.8 | 54.78 | 1.9 | LlamaForCausalLM |
| pythia-1.4b-deduped-sharegpt | Fine Tuned Models | 14.2 | 35.11 | 34.3 | 54.49 | 24.0 | 41.81 | 55.25 | 0.83 | GPTNeoXForCausalLM |
| wangchanglm-7.5B-sft-enth | Fine Tuned Models | 75 | 35.11 | 33.79 | 58.99 | 24.52 | 34.9 | 57.93 | 0.53 | XGLMForCausalLM |
| metharme-1.3b | Fine Tuned Models | 15.2 | 35.04 | 34.39 | 55.94 | 25.07 | 37.68 | 56.43 | 0.76 | GPTNeoXForCausalLM |
| falcon-1b-t-sft | Chat Models | 13.1 | 35.02 | 32.94 | 57.24 | 25.26 | 38.49 | 55.88 | 0.3 | FalconForCausalLM |
| LLmRa-1.3B | Fine Tuned Models | 13.1 | 35.0 | 32.68 | 58.77 | 23.23 | 36.21 | 59.04 | 0.08 | XGLMForCausalLM |
| pythia-1.4b-deduped | Pretrained Models | 14 | 35.0 | 32.68 | 54.96 | 25.56 | 38.66 | 57.3 | 0.83 | GPTNeoXForCausalLM |
| TinyLlama-1.1B-intermediate-step-715k-1.5T-lr-5-1epch-airoboros3.1-1k-instruct-V1 | Unknown Model Types | 11 | 34.98 | 30.72 | 54.32 | 24.78 | 41.67 | 57.62 | 0.76 | LlamaForCausalLM |
| falcon_1b_stage3 | Fine Tuned Models | 10 | 34.95 | 33.11 | 54.08 | 25.11 | 37.92 | 59.51 | 0.0 | FalconForCausalLM |
| TinyLlama-1.1B-Chat-v0.6 | Unknown Model Types | 11 | 34.94 | 31.66 | 55.79 | 25.98 | 34.72 | 59.35 | 2.12 | LlamaForCausalLM |
| TinyLlama-1.1B-Remix-V.2 | Fine Tuned Models | 11 | 34.91 | 33.19 | 56.62 | 25.99 | 34.64 | 58.09 | 0.91 | LlamaForCausalLM |
| Tiny-Vicuna-1B | Chat Models | 11 | 34.76 | 33.45 | 55.92 | 25.45 | 33.82 | 58.41 | 1.52 | LlamaForCausalLM |
| megachat | Chat Models | 0 | 34.75 | 30.8 | 54.35 | 25.55 | 39.85 | 56.99 | 0.99 | LlamaForCausalLM |
| lamini-neo-1.3b | Fine Tuned Models | 13.2 | 34.73 | 32.76 | 49.13 | 28.79 | 41.05 | 56.51 | 0.15 | Unknown |
| LaMini-GPT-1.5B | Fine Tuned Models | 15 | 34.67 | 31.4 | 48.38 | 29.92 | 42.47 | 55.88 | 0.0 | GPT2LMHeadModel |
| WizardCoder-15B-V1.0 | Fine Tuned Models | 150 | 34.64 | 32.34 | 47.2 | 29.43 | 41.56 | 55.17 | 2.12 | GPTBigCodeForCausalLM |
| TinyWand-SFT | Chat Models | 16.3 | 34.61 | 31.4 | 49.96 | 25.98 | 43.08 | 55.17 | 2.05 | LlamaForCausalLM |
| opt-1.3b | Unknown Model Types | 13 | 34.6 | 29.52 | 54.53 | 24.96 | 38.71 | 59.75 | 0.15 | OPTForCausalLM |
| TinyLlama-1.1B-Chat-v0.1 | Fine Tuned Models | 11 | 34.57 | 32.0 | 54.21 | 26.71 | 39.03 | 54.93 | 0.53 | Unknown |
| TinyLlama-1.1B-intermediate-step-955k-token-2T | Unknown Model Types | 11 | 34.56 | 30.29 | 54.84 | 26.47 | 36.07 | 58.33 | 1.36 | LlamaForCausalLM |
| gpt-sw3-1.3b-instruct | Chat Models | 14.4 | 34.54 | 30.97 | 51.42 | 26.17 | 40.31 | 56.75 | 1.59 | GPT2LMHeadModel |
| TinyLlama-1.1B-step-2T-lr-5-5ep-oasst1-top1-instruct-V1 | Unknown Model Types | 11 | 34.53 | 31.06 | 55.02 | 26.41 | 35.08 | 58.01 | 1.59 | LlamaForCausalLM |
| tinyllama-1.1b-chat-v0.3_platypus | Chat Models | 11 | 34.5 | 30.29 | 55.12 | 26.13 | 39.15 | 55.8 | 0.53 | LlamaForCausalLM |
| pythia-1.3b | Pretrained Models | 13.1 | 34.46 | 31.14 | 51.43 | 26.55 | 39.24 | 57.38 | 0.99 | Unknown |
| PULI-GPTrio | Pretrained Models | 0 | 34.42 | 30.72 | 53.49 | 24.73 | 39.03 | 57.77 | 0.76 | GPTNeoXForCausalLM |
| TinyLlama-1.1B-intermediate-step-480k-1T | Pretrained Models | 10.3 | 34.37 | 30.89 | 52.97 | 25.0 | 39.55 | 57.3 | 0.53 | Unknown |
| EverythingLM-13B-16K-GPTQ | Fine Tuned Models | 162.3 | 34.37 | 29.27 | 26.24 | 25.4 | 48.58 | 71.35 | 5.38 | LlamaForCausalLM |
| stablelm-base-alpha-7b | Pretrained Models | 70 | 34.37 | 32.0 | 51.78 | 26.21 | 40.19 | 55.41 | 0.61 | GPTNeoXForCausalLM |
| h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-300bt | Fine Tuned Models | 70 | 34.32 | 34.04 | 50.51 | 24.66 | 41.8 | 54.93 | 0.0 | LlamaForCausalLM |
| xglm-4.5B | Pretrained Models | 50.8 | 34.31 | 31.48 | 57.95 | 25.43 | 35.84 | 54.93 | 0.23 | XGLMForCausalLM |
| gpt-sw3-1.3b | Pretrained Models | 14.4 | 34.31 | 30.38 | 50.4 | 26.14 | 39.97 | 58.88 | 0.08 | GPT2LMHeadModel |
| LLmRa-1.3B_V2 | Fine Tuned Models | 13.2 | 34.21 | 30.46 | 53.03 | 26.06 | 36.46 | 59.27 | 0.0 | OPTForCausalLM |
| dlite-v2-1_5b | Fine Tuned Models | 50 | 34.2 | 32.59 | 53.98 | 24.93 | 38.77 | 54.7 | 0.23 | GPT2LMHeadModel |
| WizardCoder-Guanaco-15B-V1.1 | Fine Tuned Models | 150 | 34.19 | 32.59 | 45.42 | 25.88 | 42.33 | 56.04 | 2.88 | GPTBigCodeForCausalLM |
| starcoder-gpteacher-code-instruct | Fine Tuned Models | 0 | 34.15 | 32.68 | 47.6 | 28.63 | 40.41 | 55.56 | 0.0 | GPTBigCodeForCausalLM |
| gpt2-xl_lima | Chat Models | 15.6 | 34.12 | 31.14 | 51.28 | 25.43 | 38.74 | 57.22 | 0.91 | GPT2LMHeadModel |
| Walter-Falcon-1B | Chat Models | 13.1 | 34.07 | 31.06 | 54.92 | 24.58 | 38.47 | 55.41 | 0.0 | FalconForCausalLM |
| TinyLlama-1.1B-2T-lr-2e-4-3ep-dolly-15k-instruct-v1 | Fine Tuned Models | 11 | 34.04 | 30.55 | 53.7 | 26.07 | 35.85 | 58.09 | 0.0 | LlamaForCausalLM |
| stablelm-tuned-alpha-7b | Fine Tuned Models | 70 | 34.04 | 31.91 | 53.59 | 24.41 | 40.37 | 53.12 | 0.83 | GPTNeoXForCausalLM |
| TinyLlama-Remix | Fine Tuned Models | 11 | 34.0 | 31.14 | 49.5 | 27.34 | 40.53 | 55.41 | 0.08 | LlamaForCausalLM |
| bloom-1b7 | Unknown Model Types | 17.2 | 33.98 | 30.63 | 47.6 | 27.48 | 41.31 | 56.04 | 0.83 | BloomForCausalLM |
| pygmalion-2.7b | Fine Tuned Models | 27 | 33.98 | 32.76 | 54.13 | 23.28 | 37.17 | 56.51 | 0.0 | GPTNeoForCausalLM |
| WizardCoder-Guanaco-15B-V1.0 | Fine Tuned Models | 150 | 33.96 | 30.46 | 45.59 | 26.79 | 46.39 | 53.12 | 1.44 | GPTBigCodeForCausalLM |
| gogpt-3b-bloom | Fine Tuned Models | 30 | 33.96 | 31.91 | 50.32 | 25.2 | 41.79 | 54.38 | 0.15 | BloomForCausalLM |
| gpt-2-xl_camel-ai-physics | Chat Models | 15.6 | 33.96 | 29.52 | 50.62 | 26.79 | 39.12 | 57.54 | 0.15 | GPT2LMHeadModel |
| WizardLM-Uncensored-SuperCOT-StoryTelling-30B-GPTQ | Unknown Model Types | 355.8 | 33.78 | 28.41 | 26.05 | 24.71 | 49.54 | 68.67 | 5.31 | LlamaForCausalLM |
| TinyLlama-1.1B-intermediate-step-240k-503b | Pretrained Models | 11 | 33.72 | 29.27 | 49.71 | 26.26 | 40.17 | 56.59 | 0.3 | Unknown |
| gpt-neo-1.3B | Pretrained Models | 13.7 | 33.58 | 31.23 | 48.47 | 24.82 | 39.63 | 56.91 | 0.45 | GPTNeoForCausalLM |
| Cerebras-GPT-2.7B-Alpaca-SP | Fine Tuned Models | 27 | 33.5 | 30.8 | 48.88 | 25.12 | 40.24 | 55.41 | 0.53 | GPT2LMHeadModel |
| gpt-neo-1.3B-emailgen | Fine Tuned Models | 13 | 33.47 | 29.95 | 47.95 | 24.11 | 42.55 | 56.27 | 0.0 | GPTNeoForCausalLM |
| TinyLlama-1.1bee | Fine Tuned Models | 11 | 33.38 | 30.55 | 51.8 | 24.25 | 39.01 | 54.46 | 0.23 | LlamaForCausalLM |
| llama2-3b-distilled-layla-v1 | Unknown Model Types | 30 | 33.36 | 30.46 | 46.05 | 23.91 | 42.14 | 57.38 | 0.23 | Unknown |
| dlite-v1-1_5b | Fine Tuned Models | 50 | 33.35 | 31.66 | 49.69 | 25.62 | 37.08 | 55.96 | 0.08 | GPT2LMHeadModel |
| polyglot-ko-12.8b | Pretrained Models | 130.6 | 33.33 | 27.05 | 51.68 | 26.64 | 34.69 | 59.75 | 0.15 | GPTNeoXForCausalLM |
| gpt2-xl-sft | Fine Tuned Models | 0 | 33.31 | 30.03 | 49.17 | 25.56 | 38.78 | 55.56 | 0.76 | GPT2LMHeadModel |
| Quokka_2.7b | Fine Tuned Models | 27.9 | 33.26 | 31.06 | 47.72 | 24.8 | 40.14 | 55.49 | 0.38 | GPT2LMHeadModel |
| Cerebras-GPT-2.7B | Pretrained Models | 27 | 33.25 | 29.1 | 49.29 | 25.17 | 41.37 | 54.14 | 0.45 | Unknown |
| SparseOPT-1.3B | Unknown Model Types | 13.2 | 33.19 | 27.13 | 48.69 | 25.6 | 39.11 | 58.56 | 0.08 | Unknown |
| gpt3-finnish-13B | Pretrained Models | 130 | 32.95 | 24.66 | 46.76 | 23.49 | 44.47 | 58.01 | 0.3 | BloomModel |
| dlite-v2-774m | Fine Tuned Models | 7.7 | 32.86 | 30.12 | 47.68 | 25.37 | 40.0 | 53.99 | 0.0 | GPT2LMHeadModel |
| pythia-1b-deduped | Pretrained Models | 10.8 | 32.78 | 29.1 | 49.65 | 24.27 | 38.94 | 53.59 | 1.14 | GPTNeoXForCausalLM |
| RWKV-4-PilePlus-1B5-20230520-2942-486Gtokens-ctx4096 | Fine Tuned Models | 14.1 | 32.68 | 30.63 | 52.63 | 25.04 | 34.96 | 52.8 | 0.0 | Unknown |
| gpt-neo-1.3B-4bit-alpaca | Fine Tuned Models | 13 | 32.58 | 28.24 | 46.35 | 25.19 | 39.26 | 56.2 | 0.23 | Unknown |
| Alpaca_spin_gpt2_e1_se0 | Fine Tuned Models | 7.7 | 32.5 | 27.99 | 45.74 | 26.68 | 39.06 | 55.56 | 0.0 | GPT2LMHeadModel |
| bloom-1b1 | Unknown Model Types | 10.6 | 32.47 | 28.33 | 42.78 | 26.7 | 41.8 | 55.01 | 0.23 | BloomForCausalLM |
| bilingual-gpt-neox-4b-instruction-sft | Chat Models | 38 | 32.46 | 28.07 | 47.5 | 23.12 | 43.76 | 52.33 | 0.0 | GPTNeoXForCausalLM |
| Alpaca_spin_tuned_gpt2_large | Fine Tuned Models | 7.7 | 32.46 | 27.9 | 45.12 | 27.08 | 39.43 | 54.62 | 0.61 | GPT2LMHeadModel |
| LaMini-GPT-774M | Unknown Model Types | 7.7 | 32.43 | 27.65 | 43.81 | 26.3 | 40.26 | 56.59 | 0.0 | GPT2LMHeadModel |
| codegen-6B-multi | Pretrained Models | 60 | 32.43 | 27.22 | 41.11 | 25.71 | 45.65 | 53.91 | 0.99 | CodeGenForCausalLM |
| deepseek-coder-1.3b-instruct | Chat Models | 13 | 32.4 | 28.58 | 39.87 | 28.47 | 44.02 | 52.41 | 1.06 | LlamaForCausalLM |
| Alpaca_spin_gpt2_e0_se1 | Fine Tuned Models | 7.7 | 32.4 | 27.99 | 45.84 | 26.44 | 38.88 | 55.17 | 0.08 | GPT2LMHeadModel |
| Alpaca_refine_gpt2_e0_se1 | Fine Tuned Models | 7.7 | 32.39 | 29.18 | 45.35 | 26.91 | 37.89 | 54.3 | 0.68 | GPT2LMHeadModel |
| gpt2-large-conversational | Chat Models | 7.7 | 32.33 | 26.96 | 44.98 | 26.33 | 39.6 | 56.04 | 0.08 | GPT2LMHeadModel |
| FLOR-1.3B-xat | Chat Models | 13.1 | 32.27 | 26.79 | 41.63 | 26.65 | 44.38 | 53.43 | 0.76 | BloomForCausalLM |
| bilingual-gpt-neox-4b-8k | Pretrained Models | 39.5 | 32.23 | 28.58 | 43.94 | 25.38 | 47.48 | 47.99 | 0.0 | GPTNeoXForCausalLM |
| Alpaca_refine_tuned_gpt2_large | Fine Tuned Models | 7.7 | 32.19 | 27.56 | 45.09 | 26.91 | 37.91 | 54.93 | 0.76 | GPT2LMHeadModel |
| bilingual-gpt-neox-4b | Pretrained Models | 39.5 | 32.14 | 29.18 | 43.73 | 23.1 | 45.0 | 51.85 | 0.0 | GPTNeoXForCausalLM |
| stablelm-tuned-alpha-3b | Fine Tuned Models | 30 | 32.14 | 27.82 | 44.06 | 23.08 | 42.33 | 55.01 | 0.53 | GPTNeoXForCausalLM |
| Medical-ChatBot | Fine Tuned Models | 0 | 32.13 | 30.55 | 38.63 | 25.98 | 41.25 | 55.41 | 0.99 | GPT2LMHeadModel |
| Alpaca_refine_gpt2_e1_se0 | Fine Tuned Models | 7.7 | 32.06 | 27.3 | 45.39 | 26.51 | 37.28 | 55.88 | 0.0 | GPT2LMHeadModel |
| Alpaca-tuned-gpt2 | Fine Tuned Models | 7.7 | 32.02 | 26.54 | 44.79 | 27.22 | 37.65 | 55.09 | 0.83 | GPT2LMHeadModel |
| Medical-ChatBot | Fine Tuned Models | 0 | 31.98 | 30.46 | 38.6 | 25.96 | 41.04 | 54.85 | 0.99 | GPT2LMHeadModel |
| SSH_355M | Fine Tuned Models | 3.6 | 31.92 | 26.96 | 38.98 | 27.59 | 44.15 | 53.83 | 0.0 | GPT2LMHeadModel |
| Medical-ChatBot | Fine Tuned Models | 0 | 31.87 | 30.46 | 38.55 | 25.91 | 41.02 | 54.22 | 1.06 | GPT2LMHeadModel |
| polyglot-ko-3.8b-total | Fine Tuned Models | 38 | 31.87 | 25.34 | 39.69 | 29.16 | 43.67 | 53.35 | 0.0 | GPTNeoXForCausalLM |
| TinyLlama-1.1B-step-50K-105b | Pretrained Models | 11 | 31.86 | 25.85 | 44.1 | 26.78 | 39.51 | 54.38 | 0.53 | Unknown |
| deepseek-coder-1.3b-chat-and-function-calling | Fine Tuned Models | 13.5 | 31.82 | 26.28 | 39.27 | 26.92 | 43.37 | 51.7 | 3.41 | LlamaForCausalLM |
| gpt2-large-lora-sft | Fine Tuned Models | 7.7 | 31.82 | 26.79 | 44.15 | 25.82 | 39.06 | 55.09 | 0.0 | GPT2LMHeadModel |
| llm-jp-13b-instruct-full-jaster-dolly-oasst-v1.0 | Chat Models | 130 | 31.77 | 26.88 | 44.78 | 23.12 | 45.19 | 50.67 | 0.0 | GPT2LMHeadModel |
| deepseek-coder-1.3b-chat | Fine Tuned Models | 13.5 | 31.74 | 25.85 | 39.59 | 26.36 | 43.92 | 51.7 | 3.03 | LlamaForCausalLM |
| orca_mini_13B-GPTQ | Unknown Model Types | 162.2 | 31.73 | 27.3 | 25.85 | 25.31 | 48.06 | 63.77 | 0.08 | LlamaForCausalLM |
| llm-jp-13b-instruct-full-jaster-v1.0 | Chat Models | 130 | 31.63 | 27.22 | 44.7 | 23.12 | 44.69 | 50.04 | 0.0 | GPT2LMHeadModel |
| deepseek-coder-1.3b-chat | Fine Tuned Models | 13.5 | 31.57 | 25.6 | 39.69 | 25.54 | 43.94 | 51.46 | 3.18 | LlamaForCausalLM |
| pythia-410m | Pretrained Models | 5.1 | 31.55 | 26.19 | 40.85 | 27.25 | 41.22 | 53.12 | 0.68 | GPTNeoXForCausalLM |
| dlite-v1-774m | Fine Tuned Models | 7.7 | 31.51 | 28.07 | 44.35 | 25.91 | 36.11 | 54.62 | 0.0 | GPT2LMHeadModel |
| stablelm-base-alpha-3b | Pretrained Models | 30 | 31.5 | 26.45 | 42.24 | 25.43 | 40.5 | 53.91 | 0.45 | GPTNeoXForCausalLM |
| Instruct_GPT | Fine Tuned Models | 0 | 31.46 | 28.24 | 39.33 | 26.84 | 39.72 | 54.3 | 0.3 | GPT2LMHeadModel |
| xglm-1.7B | Unknown Model Types | 17 | 31.42 | 25.85 | 45.68 | 25.1 | 37.21 | 53.91 | 0.76 | XGLMForCausalLM |
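The Average column above is the unweighted mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), and the table is sorted by that mean. The sketch below reproduces the calculation for two rows copied from the table; the dictionary-per-row layout and field names are only an illustrative assumption, not the leaderboard's actual data format.

```python
# Minimal sketch: recompute the "Average" column as the unweighted mean of the
# six benchmark scores, then rank models by it (descending), as the table does.
from statistics import mean

BENCHMARKS = ["ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]

# Two rows copied from the table above (hypothetical schema for illustration).
rows = [
    {"Model": "wizard-mega-13B-GPTQ", "ARC": 27.73, "HellaSwag": 26.01,
     "MMLU": 24.97, "TruthfulQA": 48.69, "Winogrande": 74.74, "GSM8K": 8.95},
    {"Model": "TinyWand-DPO", "ARC": 31.66, "HellaSwag": 50.42,
     "MMLU": 26.22, "TruthfulQA": 45.8, "Winogrande": 54.78, "GSM8K": 1.9},
]

for row in rows:
    row["Average"] = round(mean(row[b] for b in BENCHMARKS), 2)

for row in sorted(rows, key=lambda r: r["Average"], reverse=True):
    print(f'{row["Model"]}: {row["Average"]}')
# wizard-mega-13B-GPTQ: 35.18
# TinyWand-DPO: 35.13
```

Both printed averages match the table's values, which is why the mean-of-six interpretation is used here.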