The Open LLM Leaderboard tracks large language model evaluation results, ranking and evaluating LLMs and chatbots by their performance across a set of benchmark tasks.
Data source: HuggingFace. The data is for reference only; official sources are authoritative.
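The Average column appears to be the unweighted arithmetic mean of the six benchmark scores (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K). A minimal Python sketch, checked against the mistral-class-tutor-7b-ep3 row below:

```python
# Minimal sketch: recompute the Average column as the unweighted mean of the
# six benchmark scores (an assumption inferred from the table values, using
# the mistral-class-tutor-7b-ep3 row as the example).
scores = {
    "ARC": 47.95,
    "HellaSwag": 77.80,
    "MMLU": 34.57,
    "TruthfulQA": 44.69,
    "Winogrande": 71.51,
    "GSM8K": 0.0,
}

average = sum(scores.values()) / len(scores)
print(round(average, 2))  # -> 46.09, matching the row's Average value
```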
| Model | Type | Parameters (100M) | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K | Architecture |
|---|---|---|---|---|---|---|---|---|---|---|
| mistral-class-tutor-7b-ep3 | Fine Tuned Models | 72.4 | 46.09 | 47.95 | 77.8 | 34.57 | 44.69 | 71.51 | 0.0 | MistralForCausalLM |
| bloom | Pretrained Models | 1762.5 | 46.07 | 50.43 | 76.41 | 30.85 | 39.76 | 72.06 | 6.9 | BloomForCausalLM |
| vicuna-7b-v1.5-lora-mctaco-modified2 | Fine Tuned Models | 66.1 | 46.03 | 42.92 | 73.97 | 48.49 | 40.43 | 69.69 | 0.68 | Unknown |
| Ambari-7B-base-v0.1-sharded | Fine Tuned Models | 68.8 | 45.92 | 47.95 | 74.62 | 40.39 | 38.91 | 72.06 | 1.59 | LlamaForCausalLM |
| ssh_1.8B | Fine Tuned Models | 18.4 | 45.91 | 39.08 | 62.37 | 44.09 | 43.15 | 59.27 | 27.52 | LlamaForCausalLM |
| quan-1.8b-chat | Chat Models | 18 | 45.91 | 39.08 | 62.37 | 44.09 | 43.15 | 59.27 | 27.52 | LlamaForCausalLM |
| mistral_v1 | Fine Tuned Models | 72.4 | 45.85 | 47.01 | 67.58 | 48.68 | 37.53 | 64.8 | 9.48 | MistralForCausalLM |
| CodeLlama-13b-Instruct-hf | Chat Models | 130.2 | 45.82 | 44.54 | 64.93 | 38.89 | 45.88 | 68.03 | 12.66 | LlamaForCausalLM |
| CodeLlama-13B-Instruct-fp16 | Chat Models | 130.2 | 45.82 | 44.62 | 64.94 | 38.77 | 45.88 | 68.03 | 12.66 | LlamaForCausalLM |
| Kan-LLaMA-7B-SFT-v0.1-sharded | Fine Tuned Models | 68.8 | 45.76 | 45.9 | 71.43 | 40.86 | 45.04 | 68.82 | 2.5 | LlamaForCausalLM |
| Ambari-7B-Instruct-v0.1-sharded | Fine Tuned Models | 68.8 | 45.74 | 50.0 | 74.59 | 38.03 | 40.39 | 69.53 | 1.9 | LlamaForCausalLM |
| llama2-7b-raw-sft | Fine Tuned Models | 67.4 | 45.67 | 47.44 | 75.25 | 33.86 | 40.77 | 73.01 | 3.71 | LlamaForCausalLM |
| mistral-7b-raw-sft | Fine Tuned Models | 67.4 | 45.67 | 47.44 | 75.25 | 33.86 | 40.77 | 73.01 | 3.71 | LlamaForCausalLM |
| Planner-7B-fp16 | Fine Tuned Models | 70 | 45.65 | 51.02 | 77.82 | 35.71 | 34.33 | 71.43 | 3.56 | LlamaForCausalLM |
| speechless-codellama-platypus-13b | Chat Models | 130 | 45.64 | 45.31 | 68.63 | 42.82 | 42.38 | 65.59 | 9.1 | LlamaForCausalLM |
| llama-base-7b | Pretrained Models | 66.1 | 45.62 | 50.94 | 77.8 | 35.67 | 34.34 | 71.43 | 3.56 | Unknown |
| PandaLM-Alpaca-7B-v1 | Fine Tuned Models | 70 | 45.59 | 50.85 | 77.36 | 35.91 | 36.63 | 71.9 | 0.91 | LlamaForCausalLM |
| Airavata | Fine Tuned Models | 68.7 | 45.52 | 46.5 | 69.26 | 43.9 | 40.62 | 68.82 | 4.02 | LlamaForCausalLM |
| tamil-llama-7b-instruct-v0.1 | Fine Tuned Models | 70 | 45.52 | 48.04 | 70.97 | 39.95 | 41.7 | 70.64 | 1.82 | LlamaForCausalLM |
| chinese-llama-plus-13b-hf | Fine Tuned Models | 130 | 45.39 | 46.25 | 71.88 | 40.74 | 39.89 | 73.09 | 0.53 | LlamaForCausalLM |
| vicuna-7b-v1.5-lora-mctaco-modified1 | Fine Tuned Models | 66.1 | 45.38 | 40.87 | 73.4 | 47.42 | 39.87 | 69.46 | 1.29 | Unknown |
| openthaigpt-1.0.0-beta-7b-chat-ckpt-hf | Fine Tuned Models | 70 | 45.35 | 44.97 | 70.19 | 36.22 | 49.99 | 69.38 | 1.36 | LlamaForCausalLM |
| ALMA-7B | Fine Tuned Models | 70 | 45.32 | 50.34 | 75.5 | 38.04 | 35.64 | 72.38 | 0.0 | LlamaForCausalLM |
| MiniChat-3B | Fine Tuned Models | 30.2 | 45.31 | 44.03 | 67.19 | 39.17 | 45.67 | 65.27 | 10.54 | LlamaForCausalLM |
| opt-iml-max-30b | Unknown Model Types | 300 | 45.28 | 43.86 | 72.39 | 41.09 | 38.16 | 73.72 | 2.5 | OPTForCausalLM |
| openbuddy-openllama-7b-v12-bf16 | Fine Tuned Models | 70 | 45.28 | 42.06 | 62.01 | 46.53 | 45.18 | 65.04 | 10.84 | LlamaForCausalLM |
| stablelm-2-1_6b | Pretrained Models | 16.4 | 45.25 | 43.34 | 70.45 | 38.95 | 36.78 | 64.56 | 17.44 | Unknown |
| HamSter-0.1 | Fine Tuned Models | 72.4 | 45.19 | 46.93 | 68.08 | 43.03 | 51.24 | 61.88 | 0.0 | MistralForCausalLM |
| llama-shishya-7b-ep3-v1 | Fine Tuned Models | 70 | 45.19 | 48.04 | 76.63 | 46.12 | 30.9 | 69.46 | 0.0 | LlamaForCausalLM |
| guanaco-unchained-llama-2-7b | Fine Tuned Models | 70 | 45.11 | 47.35 | 72.16 | 41.76 | 41.49 | 64.48 | 3.41 | Unknown |
| speechless-coding-7b-16k-tora | Fine Tuned Models | 70 | 45.1 | 41.21 | 64.45 | 39.14 | 44.91 | 63.61 | 17.29 | LlamaForCausalLM |
| vicuna-7b-v1.5-lora-mctaco-modified4 | Fine Tuned Models | 66.1 | 45.1 | 40.7 | 73.08 | 47.26 | 41.59 | 67.88 | 0.08 | Unknown |
| speechless-coding-7b-16k-tora | Fine Tuned Models | 70 | 45.05 | 41.13 | 64.48 | 38.86 | 44.95 | 63.85 | 17.06 | LlamaForCausalLM |
| Qwen-VL-LLaMAfied-7B-Chat | Fine Tuned Models | 70 | 45.0 | 47.35 | 69.97 | 44.12 | 42.87 | 65.67 | 0.0 | LlamaForCausalLM |
| llama-7b-logicot | Unknown Model Types | 70 | 44.95 | 47.01 | 72.56 | 38.93 | 43.63 | 67.56 | 0.0 | LlamaForCausalLM |
| WizardLM-7B-Uncensored | Fine Tuned Models | 66.1 | 44.92 | 47.87 | 73.08 | 35.42 | 41.49 | 68.43 | 3.26 | Unknown |
| codellama-13b-oasst-sft-v10 | Chat Models | 130.2 | 44.85 | 45.39 | 62.36 | 35.36 | 45.02 | 67.8 | 13.19 | LlamaForCausalLM |
| CodeLLaMA-chat-13b-Chinese | Fine Tuned Models | 128.5 | 44.84 | 43.26 | 63.87 | 34.29 | 48.97 | 67.88 | 10.77 | Unknown |
| speechless-codellama-orca-13b | Chat Models | 130 | 44.83 | 44.37 | 65.2 | 43.46 | 45.94 | 64.01 | 5.99 | LlamaForCausalLM |
| mistral-class-shishya-all-hal-7b-ep3 | Fine Tuned Models | 70 | 44.8 | 46.59 | 78.87 | 34.45 | 35.98 | 72.93 | 0.0 | MistralForCausalLM |
| chinese-alpaca-plus-7b-hf | Fine Tuned Models | 70 | 44.77 | 49.23 | 70.48 | 38.39 | 39.72 | 70.09 | 0.68 | LlamaForCausalLM |
| MiniMA-2-3B | Fine Tuned Models | 30 | 44.75 | 44.71 | 69.33 | 41.22 | 38.44 | 66.69 | 8.11 | LlamaForCausalLM |
| Qwen-1_8B-Llamafied | Pretrained Models | 18.4 | 44.75 | 37.71 | 58.87 | 46.37 | 39.41 | 61.72 | 24.41 | LlamaForCausalLM |
| palmyra-med-20b | Chat Models | 200 | 44.71 | 46.93 | 73.51 | 44.34 | 35.47 | 65.35 | 2.65 | GPT2LMHeadModel |
| Poro-34B-GPTQ | Chat Models | 480.6 | 44.67 | 47.01 | 73.75 | 32.47 | 38.37 | 71.35 | 5.08 | BloomForCausalLM |
| ThetaWave-14B-v0.1 | Pretrained Models | 142.2 | 44.54 | 42.83 | 47.09 | 61.45 | 50.41 | 65.43 | 0.0 | MistralForCausalLM |
| tamil-llama-7b-base-v0.1 | Fine Tuned Models | 70 | 44.52 | 46.67 | 72.85 | 40.95 | 35.93 | 70.72 | 0.0 | LlamaForCausalLM |
| Project-Baize-v2-7B-GPTQ | Unknown Model Types | 90.4 | 44.5 | 45.99 | 73.44 | 35.46 | 39.92 | 69.69 | 2.5 | LlamaForCausalLM |
| h2o-danube-1.8b-chat | Chat Models | 18.3 | 44.49 | 41.13 | 68.06 | 33.41 | 41.64 | 65.35 | 17.36 | MistralForCausalLM |
| falcon_7b_norobots | Chat Models | 70 | 44.46 | 47.87 | 77.92 | 27.94 | 36.81 | 71.74 | 4.47 | Unknown |
| falcon_7b_norobots | Fine Tuned Models | 70 | 44.4 | 48.12 | 77.9 | 28.11 | 36.76 | 71.59 | 3.94 | Unknown |
| rank_vicuna_7b_v1_fp16 | Fine Tuned Models | 70 | 44.36 | 44.62 | 65.67 | 44.14 | 45.13 | 66.61 | 0.0 | LlamaForCausalLM |
| llama-shishya-7b-ep3-v2 | Fine Tuned Models | 70 | 44.33 | 47.35 | 75.88 | 43.84 | 30.16 | 68.75 | 0.0 | LlamaForCausalLM |
| CodeLlama-34b-Instruct-hf | Chat Models | 337.4 | 44.33 | 40.78 | 35.66 | 39.72 | 44.29 | 74.51 | 31.01 | LlamaForCausalLM |
| koala-7B-HF | Fine Tuned Models | 70 | 44.29 | 47.1 | 73.58 | 25.53 | 45.96 | 69.93 | 3.64 | LlamaForCausalLM |
| mistral-class-shishya-7b-ep3 | Fine Tuned Models | 70 | 44.28 | 46.59 | 76.62 | 39.07 | 33.54 | 69.85 | 0.0 | MistralForCausalLM |
| open_llama_7b_v2 | Pretrained Models | 70 | 44.26 | 43.69 | 72.2 | 41.29 | 35.54 | 69.38 | 3.49 | LlamaForCausalLM |
| tora-code-13b-v1.0 | Fine Tuned Models | 130 | 44.19 | 44.71 | 69.15 | 36.69 | 34.98 | 63.14 | 16.45 | LlamaForCausalLM |
| falcon-7b | Pretrained Models | 70 | 44.17 | 47.87 | 78.13 | 27.79 | 34.26 | 72.38 | 4.62 | FalconForCausalLM |
| speechless-codellama-airoboros-orca-platypus-13b | Fine Tuned Models | 130 | 44.1 | 44.88 | 67.7 | 43.16 | 40.88 | 66.14 | 1.82 | LlamaForCausalLM |
| falcon_7b_DolphinCoder | Fine Tuned Models | 70 | 44.09 | 48.72 | 78.03 | 27.08 | 35.12 | 70.48 | 5.08 | Unknown |
| calm2-7b-chat-dpo-experimental | Chat Models | 70.1 | 44.03 | 41.04 | 68.99 | 39.82 | 43.13 | 65.67 | 5.53 | LlamaForCausalLM |
| llama-class-shishya-7b-ep3 | Fine Tuned Models | 70 | 43.88 | 40.78 | 77.04 | 46.74 | 27.94 | 70.8 | 0.0 | LlamaForCausalLM |
| BigTranslate-13B-GPTQ | Fine Tuned Models | 179.9 | 43.86 | 45.31 | 75.1 | 31.18 | 40.6 | 70.96 | 0.0 | LlamaForCausalLM |
| gpt-sw3-20b-instruct | Chat Models | 209.2 | 43.7 | 43.17 | 71.09 | 31.32 | 41.02 | 66.77 | 8.79 | GPT2LMHeadModel |
| h2o-danube-1.8b-sft | Fine Tuned Models | 18.3 | 43.68 | 40.19 | 67.34 | 33.75 | 40.29 | 65.43 | 15.09 | MistralForCausalLM |
| falcon_7b_3epoch_norobots | Chat Models | 70 | 43.65 | 47.61 | 77.24 | 29.73 | 36.27 | 69.53 | 1.52 | Unknown |
| deepseek-coder-6.7b-instruct | Chat Models | 67.4 | 43.57 | 38.14 | 55.09 | 39.02 | 45.56 | 56.83 | 26.76 | LlamaForCausalLM |
| amber_fine_tune_sg_part1 | Fine Tuned Models | 67.4 | 43.5 | 44.88 | 75.1 | 29.36 | 40.85 | 67.01 | 3.79 | LlamaForCausalLM |
| gpt-sw3-40b | Pretrained Models | 399.3 | 43.42 | 43.0 | 72.37 | 34.97 | 37.52 | 67.96 | 4.7 | GPT2LMHeadModel |
| minima-3b-layla-v2 | Fine Tuned Models | 30 | 43.39 | 44.2 | 69.93 | 28.53 | 43.64 | 65.43 | 8.64 | LlamaForCausalLM |
| CodeLlama-13b-hf | Pretrained Models | 130.2 | 43.35 | 40.87 | 63.35 | 32.81 | 43.79 | 67.17 | 12.13 | LlamaForCausalLM |
| tigerbot-7b-sft | Unknown Model Types | 70.7 | 43.35 | 41.64 | 60.56 | 29.89 | 58.18 | 63.54 | 6.29 | Unknown |
| quan-1.8b-base | Pretrained Models | 18 | 43.35 | 36.95 | 58.46 | 45.44 | 41.6 | 57.93 | 19.71 | LlamaForCausalLM |
| Kan-LLaMA-7B-base | Fine Tuned Models | 68.8 | 43.31 | 43.94 | 70.75 | 37.06 | 39.57 | 68.51 | 0.0 | LlamaForCausalLM |
| amber_fine_tune_001 | Fine Tuned Models | 67.4 | 43.28 | 44.8 | 73.78 | 30.41 | 42.93 | 64.09 | 3.64 | LlamaForCausalLM |
| calm2-7b-chat | Chat Models | 70 | 43.27 | 40.27 | 68.12 | 39.39 | 41.96 | 64.96 | 4.93 | LlamaForCausalLM |
| falcon-7b-instruct | Chat Models | 70 | 43.26 | 46.16 | 70.85 | 25.84 | 44.08 | 67.96 | 4.7 | FalconForCausalLM |
| Guanaco | Unknown Model Types | 0 | 43.25 | 50.17 | 72.69 | 30.3 | 37.64 | 68.67 | 0.0 | LlamaForCausalLM |
| minima-3b-layla-v1 | Unknown Model Types | 30 | 43.21 | 42.32 | 67.48 | 28.44 | 46.46 | 65.9 | 8.64 | LlamaForCausalLM |
| falcon-7b-instruct | Chat Models | 70 | 43.16 | 45.82 | 70.78 | 25.66 | 44.07 | 68.03 | 4.62 | FalconForCausalLM |
| chinese-llama-2-7b | Fine Tuned Models | 67 | 43.14 | 44.45 | 69.5 | 37.47 | 37.0 | 68.98 | 1.44 | Unknown |
| GPT-JT-6B-v1 | Fine Tuned Models | 60 | 43.13 | 40.87 | 67.15 | 47.19 | 37.07 | 65.27 | 1.21 | GPTJForCausalLM |
| ex-llm-e1 | Chat Models | 0 | 43.11 | 39.93 | 68.11 | 39.44 | 42.01 | 64.88 | 4.32 | Unknown |
| phoenix-inst-chat-7b | Unknown Model Types | 70 | 43.03 | 44.71 | 63.23 | 39.06 | 47.08 | 62.83 | 1.29 | BloomForCausalLM |
| GPT-NeoXT-Chat-Base-20B | Fine Tuned Models | 200 | 43.02 | 45.65 | 74.03 | 29.92 | 34.51 | 67.09 | 6.9 | GPTNeoXForCausalLM |
| galpaca-30b | Fine Tuned Models | 300 | 43.0 | 49.57 | 58.2 | 43.78 | 41.16 | 62.51 | 2.81 | OPTForCausalLM |
| CodeLlama-34B-Instruct-fp16 | Chat Models | 337.4 | 43.0 | 40.78 | 35.66 | 39.72 | 44.29 | 74.51 | 23.05 | LlamaForCausalLM |
| Anima-7B-100K | Fine Tuned Models | 70 | 42.98 | 46.59 | 72.28 | 33.4 | 37.84 | 67.09 | 0.68 | LlamaForCausalLM |
| Deita-1_8B | Chat Models | 80 | 42.96 | 36.52 | 60.63 | 45.62 | 40.02 | 59.35 | 15.62 | LlamaForCausalLM |
| Qwen-1_8B-Chat-llama | Chat Models | 18.4 | 42.94 | 36.95 | 54.34 | 44.55 | 43.7 | 58.88 | 19.26 | LlamaForCausalLM |
| InstructPalmyra-20b | Chat Models | 200 | 42.91 | 47.1 | 73.0 | 28.26 | 41.81 | 64.72 | 2.58 | GPT2LMHeadModel |
| dopeyshearedplats-2.7b-v1 | Fine Tuned Models | 27 | 42.9 | 46.08 | 75.17 | 29.01 | 44.12 | 62.67 | 0.38 | LlamaForCausalLM |
| landmark-attention-llama7b-fp16 | Fine Tuned Models | 66.1 | 42.84 | 47.35 | 65.81 | 31.59 | 42.63 | 68.03 | 1.59 | Unknown |
| opt-66b | Pretrained Models | 660 | 42.78 | 46.33 | 76.25 | 26.99 | 35.43 | 70.01 | 1.67 | OPTForCausalLM |
| Qwen-1_8b-EverythingLM | Chat Models | 18.4 | 42.77 | 38.65 | 62.66 | 44.94 | 38.7 | 58.96 | 12.74 | LlamaForCausalLM |
| tora-code-13b-v1.0 | Chat Models | 130 | 42.7 | 44.45 | 69.29 | 36.67 | 34.98 | 62.59 | 8.19 | LlamaForCausalLM |
| open-llama-7b-open-instruct | Unknown Model Types | 70 | 42.59 | 49.74 | 73.67 | 31.52 | 34.65 | 65.43 | 0.53 | LlamaForCausalLM |
| codegen-16B-nl | Pretrained Models | 160 | 42.59 | 46.76 | 71.87 | 32.35 | 33.95 | 67.96 | 2.65 | CodeGenForCausalLM |