The Open LLM Leaderboard tracks evaluation results for large language models, ranking and assessing LLMs and chatbots by their performance across a range of benchmark tasks.
Data source: HuggingFace. Data is for reference only; official sources are authoritative. Click model names to view DataLearner model profiles.
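Each model's Average is the unweighted mean of its six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), as checking any row confirms. A minimal sketch, using KoRWKV-6B's scores from the table below:

```python
# The leaderboard "Average" is the unweighted mean of the six benchmark
# scores. These numbers are KoRWKV-6B's row from the table below.
scores = {
    "ARC": 22.1,
    "HellaSwag": 32.18,
    "MMLU": 24.69,
    "TruthfulQA": 39.05,
    "Winogrande": 51.14,
    "GSM8K": 0.0,
}
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 28.19, matching the table's Average column
```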
| Model | Type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| KoRWKV-6B | Pretrained Models | 65.3 | 28.19 | 22.1 | 32.18 | 24.69 | 39.05 | 51.14 | 0.0 | RwkvForCausalLM |
| code_gpt2 | Fine Tuned Models | 1.2 | 28.19 | 23.29 | 30.99 | 25.03 | 40.6 | 49.25 | 0.0 | GPT2LMHeadModel |
| TinyMistral-248M-Instruct | Chat Models | 2.5 | 28.19 | 24.32 | 27.52 | 25.18 | 41.94 | 50.2 | 0.0 | MistralForCausalLM |
| distilgpt2-HC3 | Fine Tuned Models | 0.9 | 28.18 | 24.66 | 27.99 | 23.95 | 42.1 | 50.36 | 0.0 | GPT2LMHeadModel |
| gpt2-dolly | Chat Models | 1.2 | 28.18 | 21.76 | 30.77 | 24.66 | 42.22 | 49.57 | 0.08 | GPT2LMHeadModel |
| smol_llama-81M-tied | Pretrained Models | 0.8 | 28.17 | 22.18 | 29.33 | 24.06 | 43.97 | 49.25 | 0.23 | LlamaForCausalLM |
| math_gpt2_sft | Fine Tuned Models | 1.2 | 28.03 | 22.87 | 30.41 | 25.06 | 37.62 | 51.54 | 0.68 | GPT2LMHeadModel |
| Med_GPT2 | Fine Tuned Models | 1.2 | 28.02 | 23.38 | 30.99 | 24.0 | 38.95 | 49.72 | 1.06 | GPT2LMHeadModel |
| LaMini-GPT-124M | Fine Tuned Models | 1.2 | 28.01 | 24.32 | 30.82 | 24.99 | 36.57 | 51.38 | 0.0 | GPT2LMHeadModel |
| chat_gpt2 | Fine Tuned Models | 0 | 27.99 | 23.04 | 30.76 | 24.39 | 39.81 | 49.96 | 0.0 | GPT2LMHeadModel |
| tinylamma-20000 | Fine Tuned Models | 11 | 27.95 | 23.81 | 32.45 | 25.37 | 34.87 | 51.22 | 0.0 | LlamaForCausalLM |
| gpt3-finnish-small | Pretrained Models | 0 | 27.95 | 20.48 | 28.09 | 24.47 | 46.47 | 48.22 | 0.0 | BloomModel |
| TinyMistral-6x248M-Instruct | Chat Models | 10 | 27.89 | 22.44 | 27.02 | 24.13 | 43.16 | 50.59 | 0.0 | MixtralForCausalLM |
| xuanxuan | Fine Tuned Models | 1.4 | 27.88 | 23.46 | 31.12 | 26.27 | 35.97 | 50.43 | 0.0 | GPT2LMHeadModel |
| gpt2-alpaca | Fine Tuned Models | 1.4 | 27.86 | 22.87 | 31.14 | 26.26 | 36.22 | 50.67 | 0.0 | GPT2LMHeadModel |
| dlite-v1-124m | Fine Tuned Models | 1.2 | 27.86 | 24.32 | 31.16 | 25.08 | 36.38 | 50.2 | 0.0 | GPT2LMHeadModel |
| kogpt | Fine Tuned Models | 3.9 | 27.83 | 21.16 | 28.11 | 26.56 | 42.06 | 49.09 | 0.0 | GPT2LMHeadModel |
| Cerebras-GPT-111M | Pretrained Models | 1.1 | 27.75 | 20.22 | 26.73 | 25.51 | 46.31 | 47.75 | 0.0 | Unknown |
| TinyMistral-248m | Pretrained Models | 2.5 | 27.73 | 22.87 | 28.02 | 23.15 | 42.52 | 49.8 | 0.0 | Unknown |
| mGPT | Pretrained Models | 0 | 27.61 | 23.81 | 26.37 | 25.17 | 39.62 | 50.67 | 0.0 | GPT2LMHeadModel |
| testmodel | Fine Tuned Models | 1.5 | 27.6 | 19.71 | 26.68 | 25.28 | 43.72 | 50.2 | 0.0 | GPT2LMHeadModel |
| 111m | Fine Tuned Models | 1.5 | 27.6 | 19.71 | 26.68 | 25.28 | 43.72 | 50.2 | 0.0 | GPT2LMHeadModel |
| TinyMistral-248M-SFT-v3 | Chat Models | 2.5 | 27.45 | 21.93 | 28.26 | 22.91 | 40.03 | 51.54 | 0.0 | Unknown |
| dolly-v2-3b | Fine Tuned Models | 30 | 22.83 | 25.26 | 26.55 | 24.7 | 0.0 | 59.43 | 1.06 | GPTNeoXForCausalLM |
| v1olet_marcoroni-go-bruins-7B | Fine Tuned Models | 70 | 22.43 | 29.1 | 28.3 | 25.09 | 0.0 | 52.09 | 0.0 | Unknown |
| v1olet_mistral_7B | Chat Models | 70 | 22.16 | 29.18 | 28.13 | 26.24 | 0.0 | 49.41 | 0.0 | Unknown |
| mistral-class-bio-tutor | Fine Tuned Models | 71.1 | 21.59 | 28.07 | 28.02 | 23.79 | 0.0 | 49.64 | 0.0 | Unknown |
| llama-2-13b-rockwellautomation | Fine Tuned Models | 130.2 | 21.48 | 28.16 | 25.77 | 25.14 | 0.0 | 49.8 | 0.0 | LlamaForCausalLM |
| bloom-560m-finetuned-fraud | Unknown Model Types | 5.6 | 21.37 | 26.96 | 28.87 | 24.03 | 0.0 | 48.38 | 0.0 | BloomForCausalLM |
| alignment-handbook-zephyr-7b_ppostep_100 | Fine Tuned Models | 72.4 | 21.3 | 29.27 | 25.87 | 23.76 | 0.0 | 48.93 | 0.0 | MistralForCausalLM |
| YetAnother_Open-Llama-3B-LoRA-OpenOrca | Fine Tuned Models | 34.3 | 21.2 | 25.94 | 25.76 | 24.65 | 0.0 | 50.83 | 0.0 | LlamaForCausalLM |
| Dante-2.8B | Unknown Model Types | 28 | 21.12 | 25.09 | 26.05 | 24.51 | 0.0 | 51.07 | 0.0 | GPTNeoXForCausalLM |
| mptk-1b | Pretrained Models | 13.1 | 20.84 | 22.7 | 25.48 | 27.11 | 0.0 | 49.72 | 0.0 | MptForCausalLM |
| mindy-7b | Fine Tuned Models | 72.4 | 20.52 | 23.63 | 25.82 | 24.15 | 0.0 | 49.49 | 0.0 | Unknown |
| test | Fine Tuned Models | 107.3 | 20.45 | 23.04 | 25.23 | 23.28 | 0.0 | 51.14 | 0.0 | LlamaForCausalLM |
| zen | Fine Tuned Models | 72.4 | 20.33 | 23.98 | 25.08 | 23.26 | 0.0 | 49.64 | 0.0 | MistralForCausalLM |
| test_wanda_240109 | Fine Tuned Models | 107.3 | 20.24 | 22.95 | 25.26 | 23.32 | 0.0 | 49.88 | 0.0 | LlamaForCausalLM |
| Sakura-SOLAR-Instruct-DPO-v1 | Chat Models | 107.3 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | Unknown |
| speechless-mistral-six-in-one-7b-orth-1.0 | Fine Tuned Models | 70 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | MistralForCausalLM |
| mpt-125m-c4 | Pretrained Models | 1.2 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | MPTForCausalLM |
| stablelm_sft_dpo | Fine Tuned Models | 78.7 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | GPTNeoXForCausalLM |
| caigun-lora-model-33B | Fine Tuned Models | 182.5 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | LlamaForCausalLM |
| moe_scratch | Fine Tuned Models | 467 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | MixtralForCausalLM |
| mistral-moe-scratch | Fine Tuned Models | 467 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | Unknown |
| Panther_v1 | Unknown Model Types | 0 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | LLaMAForCausalLM |
| Llama-2-ft-instruct-es | Fine Tuned Models | 0 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | LlamaForCausalLM |
| llama-2-13b-dolphin-peft | Fine Tuned Models | 130 | 20.07 | 22.7 | 25.04 | 23.12 | 0.0 | 49.57 | 0.0 | Unknown |
| Pythia-31M-Chat-v1 | Chat Models | 0.3 | 19.92 | 22.7 | 25.6 | 23.24 | 0.0 | 47.99 | 0.0 | GPTNeoXForCausalLM |
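Because the leaderboard is published as a plain table, it is easy to analyze locally. The sketch below (an illustration, not the leaderboard's own tooling) parses a markdown table of this shape with pandas; the two sample rows are copied verbatim from the table above.

```python
import io

import pandas as pd

# Parse a markdown table of leaderboard rows into a DataFrame.
md = """\
| Model | Type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| KoRWKV-6B | Pretrained Models | 65.3 | 28.19 | 22.1 | 32.18 | 24.69 | 39.05 | 51.14 | 0.0 | RwkvForCausalLM |
| dolly-v2-3b | Fine Tuned Models | 30 | 22.83 | 25.26 | 26.55 | 24.7 | 0.0 | 59.43 | 1.06 | GPTNeoXForCausalLM |
"""

df = pd.read_csv(io.StringIO(md), sep="|", skipinitialspace=True)
df = df.dropna(axis=1, how="all")          # drop empty edge columns from the leading/trailing pipes
df.columns = [c.strip() for c in df.columns]
df = df[~df["Model"].str.contains("---")]  # drop the markdown separator row
df["Model"] = df["Model"].str.strip()
score_cols = ["Average", "ARC", "HellaSwag", "MMLU", "TruthfulQA", "Winogrande", "GSM8K"]
df[score_cols] = df[score_cols].astype(float)

# Example query: sort by MMLU, best first.
print(df.sort_values("MMLU", ascending=False)[["Model", "MMLU", "Average"]])
```

Pasting the remaining rows into `md` lets the same queries run over the full table.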