The Open LLM Leaderboard tracks large language model evaluation results, ranking LLMs and chatbots by their performance across a set of benchmark tasks.
Data source: HuggingFace. Figures are for reference only; official sources are authoritative.
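The Average column appears to be the arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K). A minimal sketch, using the DistiLabelOrca-TinyLLama-1.1B row from the table below as a worked example:

```python
# Benchmark scores for DistiLabelOrca-TinyLLama-1.1B, copied from the table.
scores = {
    "ARC": 36.18, "HellaSwag": 61.15, "MMLU": 25.09,
    "TruthfulQA": 38.05, "Winogrande": 60.85, "GSM8K": 1.67,
}

# The leaderboard's Average is the plain mean of the six task scores.
mean = sum(scores.values()) / len(scores)
print(f"{mean:.3f}")  # 37.165, which the table reports rounded as 37.17
```

The same calculation reproduces the Average column for the other rows to within rounding.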
| Model | Type | Parameters (×100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| DistiLabelOrca-TinyLLama-1.1B | Fine Tuned Models | 11 | 37.17 | 36.18 | 61.15 | 25.09 | 38.05 | 60.85 | 1.67 | LlamaForCausalLM |
| TinyLlama-1.1B-2.5T-chat-and-function-calling | Fine Tuned Models | 11 | 37.16 | 34.39 | 59.61 | 26.32 | 38.92 | 61.96 | 1.74 | LlamaForCausalLM |
| lamatama | Fine Tuned Models | 11 | 37.15 | 36.35 | 61.12 | 24.72 | 37.67 | 60.77 | 2.27 | LlamaForCausalLM |
| Sheared-LLaMA-1.3B-ShareGPT | Fine Tuned Models | 13 | 37.14 | 33.96 | 62.55 | 26.42 | 43.03 | 56.83 | 0.08 | LlamaForCausalLM |
| Barcenas-Tiny-1.1b-DPO | Fine Tuned Models | 11 | 37.12 | 36.26 | 61.2 | 24.83 | 37.45 | 60.93 | 2.05 | LlamaForCausalLM |
| pythia-2.7b | Pretrained Models | 29.1 | 37.09 | 37.37 | 60.74 | 25.86 | 35.4 | 62.12 | 1.06 | Unknown |
| TinyLlama-repeat | Fine Tuned Models | 11 | 37.09 | 35.24 | 60.25 | 26.07 | 38.78 | 60.46 | 1.74 | LlamaForCausalLM |
| TinyLlama-1.1B-Chat-v1.0-intel-dpo | Chat Models | 11 | 37.09 | 35.84 | 61.29 | 25.05 | 37.38 | 61.01 | 1.97 | LlamaForCausalLM |
| falcon-rw-1b | Pretrained Models | 10 | 37.07 | 35.07 | 63.56 | 25.28 | 35.96 | 62.04 | 0.53 | FalconForCausalLM |
| Phind-CodeLlama-34B-v1 | Fine Tuned Models | 340 | 37.06 | 27.13 | 28.28 | 28.94 | 44.94 | 72.61 | 20.47 | LlamaForCausalLM |
| OpenHermes-2.5-FLOR-6.3B | Fine Tuned Models | 63 | 37.04 | 33.45 | 54.53 | 25.18 | 46.12 | 62.98 | 0.0 | BloomForCausalLM |
| TinyLlama-Cinder-1.3B-Test.2 | Fine Tuned Models | 12.8 | 37.04 | 33.7 | 58.66 | 25.69 | 37.98 | 64.09 | 2.12 | LlamaForCausalLM |
| bloomz-3b | Unknown Model Types | 30 | 37.03 | 36.86 | 54.95 | 32.91 | 40.34 | 57.14 | 0.0 | BloomForCausalLM |
| TinyNaughtyLlama-v1.0 | Chat Models | 11 | 37.03 | 35.92 | 61.04 | 25.82 | 36.77 | 60.22 | 2.43 | LlamaForCausalLM |
| TinyLlama-1.1B-Chat-v1.0-reasoning-v2-dpo | Fine Tuned Models | 11 | 37.03 | 34.39 | 61.87 | 26.34 | 36.13 | 63.46 | 0.0 | LlamaForCausalLM |
| TinyLlama-1.1B-miniguanaco | Fine Tuned Models | 11 | 37.02 | 35.15 | 60.26 | 26.26 | 38.84 | 60.14 | 1.44 | ? |
| CodeLlama-13b-Python-hf | Chat Models | 130.2 | 37.0 | 32.59 | 43.94 | 27.23 | 44.59 | 65.04 | 8.64 | LlamaForCausalLM |
| TinyLlama-1.1B-Chat-v1.0-x2-MoE | Fine Tuned Models | 18.6 | 36.98 | 36.01 | 61.04 | 24.81 | 37.37 | 60.38 | 2.27 | MixtralForCausalLM |
| OPT-2.7B-Erebus | Fine Tuned Models | 27 | 36.96 | 34.39 | 60.91 | 26.7 | 37.82 | 61.64 | 0.3 | OPTForCausalLM |
| LlamaCorn-1.1B | Fine Tuned Models | 11 | 36.94 | 34.13 | 59.33 | 29.01 | 36.78 | 61.96 | 0.45 | LlamaForCausalLM |
| bloomz-3b-sft-chat | Fine Tuned Models | 30 | 36.94 | 36.86 | 54.34 | 31.49 | 39.69 | 58.88 | 0.38 | BloomForCausalLM |
| TinyLlama-1.1B-2.5T-chat | Fine Tuned Models | 11 | 36.93 | 34.47 | 59.71 | 26.45 | 38.8 | 61.01 | 1.14 | LlamaForCausalLM |
| blossom-v1-3b | Fine Tuned Models | 30 | 36.9 | 36.86 | 55.1 | 26.7 | 43.45 | 58.88 | 0.38 | BloomForCausalLM |
| Phind-CodeLlama-34B-v2 | Chat Models | 340 | 36.89 | 24.57 | 27.6 | 25.76 | 48.37 | 71.82 | 23.2 | LlamaForCausalLM |
| CodeLlama-7b-Python-hf | Chat Models | 67.4 | 36.89 | 31.31 | 52.86 | 27.32 | 42.21 | 63.06 | 4.55 | LlamaForCausalLM |
| falcon_1b_stage2 | Fine Tuned Models | 10 | 36.88 | 33.11 | 63.19 | 24.22 | 38.4 | 62.35 | 0.0 | FalconForCausalLM |
| OPT-2.7B-Nerybus-Mix | Fine Tuned Models | 27 | 36.88 | 33.7 | 61.21 | 26.6 | 37.57 | 62.04 | 0.15 | OPTForCausalLM |
| openbuddy-openllama-3b-v10-bf16 | Fine Tuned Models | 30 | 36.87 | 36.26 | 58.38 | 23.89 | 42.04 | 59.67 | 0.99 | LlamaForCausalLM |
| Tinyllama-1.3B-Cinder-Reason-Test-2 | Fine Tuned Models | 12.8 | 36.83 | 32.76 | 57.92 | 25.42 | 37.26 | 64.8 | 2.81 | LlamaForCausalLM |
| tinyllama-1.1b-layla-v1 | Fine Tuned Models | 11 | 36.82 | 34.39 | 59.86 | 24.7 | 41.03 | 59.75 | 1.21 | LlamaForCausalLM |
| camel-5b-hf | Fine Tuned Models | 50 | 36.81 | 35.15 | 57.62 | 26.07 | 40.65 | 61.01 | 0.38 | GPT2LMHeadModel |
| palmer-002 | Fine Tuned Models | 0 | 36.79 | 34.47 | 59.41 | 25.94 | 37.06 | 62.67 | 1.21 | LlamaForCausalLM |
| pythia-2.8b-4bit-alpaca | Fine Tuned Models | 28 | 36.77 | 34.73 | 58.96 | 25.53 | 39.14 | 61.64 | 0.61 | Unknown |
| OPT-2.7B-Nerys-v2 | Fine Tuned Models | 27 | 36.75 | 33.28 | 61.23 | 26.44 | 37.23 | 62.04 | 0.3 | OPTForCausalLM |
| dopeyshearedplats-1.3b-v1 | Chat Models | 13 | 36.74 | 34.39 | 64.31 | 25.4 | 38.21 | 57.38 | 0.76 | LlamaForCausalLM |
| opt-2.7b | Pretrained Models | 27 | 36.74 | 33.96 | 61.43 | 25.43 | 37.43 | 61.96 | 0.23 | OPTForCausalLM |
| LLmRa-2.7B | Fine Tuned Models | 27 | 36.72 | 37.03 | 60.65 | 25.58 | 35.23 | 61.56 | 0.3 | OPTForCausalLM |
| pythia-2.8b-deduped | Pretrained Models | 29.1 | 36.72 | 36.26 | 60.66 | 26.78 | 35.56 | 60.22 | 0.83 | GPTNeoXForCausalLM |
| chopt-2_7b | Fine Tuned Models | 70 | 36.72 | 36.01 | 63.38 | 25.44 | 37.71 | 57.77 | 0.0 | OPTForCausalLM |
| open_llama_3b_600bt_preview | Fine Tuned Models | 34.3 | 36.65 | 36.86 | 59.96 | 25.97 | 32.81 | 63.69 | 0.61 | LlamaForCausalLM |
| 42dot_LLM-SFT-1.3B | Chat Models | 14.4 | 36.61 | 36.09 | 58.96 | 25.51 | 39.98 | 58.41 | 0.68 | LlamaForCausalLM |
| TinyLlama-1.1B-OpenHermes-2.5-Chat-v0.1-sft | Fine Tuned Models | 11 | 36.59 | 33.79 | 58.72 | 24.52 | 36.22 | 60.93 | 5.38 | LlamaForCausalLM |
| Deer-3b | Chat Models | 30 | 36.55 | 38.48 | 57.41 | 25.64 | 39.98 | 57.46 | 0.3 | BloomForCausalLM |
| Tukan-1.1B-Chat-reasoning-sft-COLA | Fine Tuned Models | 11 | 36.53 | 34.13 | 59.78 | 24.86 | 38.25 | 60.77 | 1.36 | LlamaForCausalLM |
| TinyLlama-3T-1.1bee | Fine Tuned Models | 11 | 36.46 | 33.79 | 60.29 | 25.86 | 38.13 | 60.22 | 0.45 | LlamaForCausalLM |
| CodeLlama-7b-Python-hf | Chat Models | 67.4 | 36.42 | 29.27 | 50.12 | 28.37 | 41.61 | 64.01 | 5.16 | LlamaForCausalLM |
| TinyLlama-1.1B-intermediate-step-1431k-3T | Pretrained Models | 11 | 36.42 | 33.87 | 60.31 | 26.04 | 37.32 | 59.51 | 1.44 | LlamaForCausalLM |
| xglm-7.5B | Pretrained Models | 75 | 36.38 | 34.13 | 60.77 | 27.79 | 36.66 | 58.72 | 0.23 | XGLMForCausalLM |
| TinyDolphin-2.8-1.1b | Chat Models | 11 | 36.34 | 34.3 | 59.44 | 25.59 | 36.51 | 60.69 | 1.52 | LlamaForCausalLM |
| Phind-CodeLlama-34B-Python-v1 | Fine Tuned Models | 340 | 36.33 | 24.66 | 29.77 | 27.95 | 45.27 | 68.82 | 21.53 | LlamaForCausalLM |
| Cerebras-GPT-6.7B | Pretrained Models | 67 | 36.27 | 35.07 | 59.36 | 25.93 | 38.02 | 58.72 | 0.53 | ? |
| TinyLlama-1.1B-intermediate-step-1195k-token-2.5T | Unknown Model Types | 11 | 36.26 | 33.53 | 59.38 | 26.22 | 36.79 | 60.22 | 1.44 | LlamaForCausalLM |
| TinyDolphin-2.8.1-1.1b | Chat Models | 11 | 36.21 | 34.98 | 60.11 | 25.31 | 35.51 | 60.69 | 0.68 | LlamaForCausalLM |
| gpt-neo-2.7B | Pretrained Models | 27.2 | 36.2 | 33.36 | 56.24 | 26.45 | 39.78 | 60.06 | 1.29 | GPTNeoForCausalLM |
| bertin-gpt-j-6B-alpaca | Fine Tuned Models | 60 | 36.19 | 36.01 | 54.3 | 27.66 | 43.38 | 55.8 | 0.0 | GPTJForCausalLM |
| falcon_1b_stage3_2 | Fine Tuned Models | 10 | 36.19 | 34.56 | 58.37 | 23.87 | 39.89 | 60.46 | 0.0 | FalconForCausalLM |
| StellarX-4B-V0.2 | Pretrained Models | 40 | 36.15 | 34.64 | 56.74 | 25.55 | 38.55 | 61.4 | 0.0 | GPTNeoXForCausalLM |
| bloom-3b | Pretrained Models | 30 | 36.07 | 35.75 | 54.37 | 26.59 | 40.57 | 57.62 | 1.52 | BloomForCausalLM |
| Wizard-Vicuna-13B-Uncensored-GPTQ | Unknown Model Types | 162.2 | 36.06 | 29.61 | 25.47 | 25.34 | 50.25 | 75.77 | 9.93 | LlamaForCausalLM |
| Deacon-1_8b | Chat Models | 18.4 | 36.03 | 33.7 | 52.33 | 33.97 | 39.05 | 57.14 | 0.0 | LlamaForCausalLM |
| TinyOpenHermes-1.1B-4k | Chat Models | 11 | 35.98 | 33.62 | 58.53 | 26.45 | 37.33 | 59.91 | 0.08 | LlamaForCausalLM |
| blossom-v2-3b | Fine Tuned Models | 30 | 35.98 | 35.32 | 54.1 | 23.99 | 43.11 | 58.8 | 0.53 | BloomForCausalLM |
| shearedplats-1.3b-v1 | Chat Models | 13 | 35.97 | 35.41 | 62.75 | 24.75 | 33.93 | 58.48 | 0.53 | LlamaForCausalLM |
| localmentor_25K_3epochs_tinyllama | Fine Tuned Models | 11 | 35.96 | 34.22 | 59.01 | 24.93 | 36.07 | 60.46 | 1.06 | Unknown |
| Sheared-LLaMA-1.3B | Fine Tuned Models | 13 | 35.95 | 32.85 | 60.91 | 25.71 | 37.14 | 58.64 | 0.45 | LlamaForCausalLM |
| TinyDolphin-2.8.2-1.1b-laser | Chat Models | 11 | 35.93 | 33.36 | 58.53 | 25.93 | 36.33 | 60.14 | 1.29 | LlamaForCausalLM |
| CodeLlama-34b-Python-hf | Fine Tuned Models | 334.8 | 35.92 | 38.05 | 34.79 | 32.96 | 43.57 | 66.14 | 0.0 | Unknown |
| TinyLamma-SFT | Fine Tuned Models | 11 | 35.88 | 34.39 | 59.14 | 24.26 | 37.2 | 58.64 | 1.67 | LlamaForCausalLM |
| opt-flan-iml-6.7b | Chat Models | 66.6 | 35.84 | 30.12 | 58.82 | 25.12 | 36.74 | 64.25 | 0.0 | OPTForCausalLM |
| Tinyllama-1.3B-Cinder-Reason-Test | Fine Tuned Models | 12.8 | 35.84 | 32.51 | 55.85 | 26.61 | 35.59 | 62.12 | 2.35 | LlamaForCausalLM |
| sheared-silicon10p | Merged Models or MoE Models | 27 | 35.82 | 36.18 | 51.12 | 25.56 | 44.85 | 57.22 | 0.0 | LlamaForCausalLM |
| Tinypus-1.5B | Fine Tuned Models | 14.5 | 35.73 | 33.45 | 57.35 | 25.53 | 39.35 | 57.7 | 0.99 | LlamaForCausalLM |
| ShearedLlama-1.3b-FFT-Test1 | Pretrained Models | 13 | 35.71 | 32.68 | 59.99 | 25.69 | 36.97 | 58.72 | 0.23 | LlamaForCausalLM |
| 42dot_LLM-PLM-1.3B | Fine Tuned Models | 14.4 | 35.7 | 32.42 | 56.39 | 27.09 | 38.68 | 58.88 | 0.76 | LlamaForCausalLM |
| starcoder-finetune-selfinstruct | Fine Tuned Models | 0 | 35.65 | 31.23 | 47.66 | 29.52 | 41.63 | 57.77 | 6.07 | Unknown |
| 20231206094523-pretrain-Llama-2-13b-hf-76000 | Fine Tuned Models | 132.5 | 35.58 | 31.06 | 52.03 | 24.43 | 44.71 | 61.25 | 0.0 | LlamaForCausalLM |
| tinyllama-oasst1-top1-instruct-full-lr1-5-v0.1 | Fine Tuned Models | 11 | 35.58 | 32.85 | 58.16 | 25.96 | 38.35 | 57.7 | 0.45 | LlamaForCausalLM |
| Cinder-1.3B-Test | Fine Tuned Models | 12.8 | 35.57 | 33.19 | 55.48 | 26.37 | 36.62 | 58.96 | 2.81 | LlamaForCausalLM |
| wizard-vicuna-13B-GPTQ | Unknown Model Types | 162.2 | 35.56 | 28.67 | 25.94 | 25.84 | 48.53 | 74.74 | 9.63 | LlamaForCausalLM |
| TinyLlama-1.1B-Chat-v0.3 | Fine Tuned Models | 10.3 | 35.56 | 35.07 | 57.7 | 25.53 | 36.67 | 57.7 | 0.68 | Unknown |
| wangchanglm-7.5B-sft-en-sharded | Fine Tuned Models | 75 | 35.55 | 34.47 | 59.81 | 26.37 | 34.15 | 58.25 | 0.23 | XGLMForCausalLM |
| open-cabrita3b | Fine Tuned Models | 30 | 35.54 | 33.79 | 55.35 | 25.16 | 38.5 | 59.43 | 0.99 | LlamaForCausalLM |
| TinyLlama-1.1B-FFT-Test2 | Fine Tuned Models | 11 | 35.53 | 34.22 | 57.96 | 25.54 | 36.32 | 58.8 | 0.38 | LlamaForCausalLM |
| starchat-alpha | Unknown Model Types | 155.2 | 35.49 | 31.57 | 49.43 | 30.76 | 43.66 | 55.09 | 2.43 | GPTBigCodeForCausalLM |
| ShortKingv0.1 | Fine Tuned Models | 14.2 | 35.45 | 34.22 | 54.59 | 25.78 | 41.64 | 56.04 | 0.45 | Unknown |
| TinyLlama-1.1B-intermediate-step-715k-1.5T-lr-5-2.2epochs-oasst1-top1-instruct-V1 | Fine Tuned Models | 11 | 35.45 | 31.48 | 54.4 | 25.47 | 42.34 | 57.54 | 1.44 | LlamaForCausalLM |
| Nape-0 | Fine Tuned Models | 11 | 35.43 | 32.68 | 58.68 | 24.88 | 38.99 | 57.3 | 0.08 | LlamaForCausalLM |
| starcoder_mirror | Fine Tuned Models | 0 | 35.43 | 31.31 | 45.82 | 29.29 | 43.38 | 57.22 | 5.53 | Unknown |
| TinyLlama-1.1B-intermediate-step-715k-1.5T-lr-5-3epochs-oasst1-top1-instruct-V1 | Fine Tuned Models | 11 | 35.42 | 31.4 | 54.24 | 25.36 | 42.47 | 57.7 | 1.36 | LlamaForCausalLM |
| openchat_v2_openorca_preview-GPTQ | Unknown Model Types | 162.2 | 35.38 | 27.99 | 26.06 | 24.24 | 50.08 | 70.64 | 13.27 | LlamaForCausalLM |
| chopt-1_3b | Fine Tuned Models | 30 | 35.32 | 31.48 | 56.63 | 25.35 | 40.19 | 58.25 | 0.0 | OPTForCausalLM |
| Walter-Llama-1B | Chat Models | 11 | 35.29 | 32.85 | 61.05 | 27.46 | 33.93 | 56.43 | 0.0 | LlamaForCausalLM |
| dopeyplats-1.1b-2T-v1 | Chat Models | 11 | 35.28 | 33.11 | 54.31 | 24.55 | 39.26 | 58.8 | 1.67 | LlamaForCausalLM |
| TinyLlama-1.1B-intermediate-step-715k-1.5T-lr-5-4epochs-oasst1-top1-instruct-V1 | Fine Tuned Models | 11 | 35.28 | 31.14 | 54.31 | 25.42 | 41.72 | 57.77 | 1.29 | LlamaForCausalLM |
| TinyLlama-3T-Cinder-v1.2 | Fine Tuned Models | 11 | 35.26 | 34.39 | 56.51 | 26.14 | 36.78 | 57.7 | 0.08 | LlamaForCausalLM |
| platypus-1_8b | Chat Models | 18.4 | 35.24 | 33.28 | 50.76 | 33.25 | 40.73 | 52.96 | 0.45 | LlamaForCausalLM |
| Deacon-1b | Chat Models | 11 | 35.21 | 32.42 | 58.62 | 24.89 | 35.05 | 59.59 | 0.68 | LlamaForCausalLM |
| opt-iml-max-1.3b | Unknown Model Types | 13 | 35.21 | 30.72 | 53.81 | 27.61 | 38.34 | 60.22 | 0.53 | OPTForCausalLM |
| palmyra-base | Pretrained Models | 0 | 35.18 | 31.91 | 55.39 | 27.15 | 37.57 | 58.09 | 0.99 | GPT2LMHeadModel |