LLM Coding Leaderboard
This page provides current LLM coding evaluation results, including HumanEval and MBPP Pass@1 scores.
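For readers unfamiliar with the metric, here is a minimal sketch of how Pass@k is commonly estimated, using the unbiased estimator introduced with HumanEval. The page does not say how each source computed its scores, so this is illustrative only; the function names `pass_at_k` and `benchmark_pass_at_k` are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that passed all tests, k: budget.
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k draw contains a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k=1):
    """Mean pass@k over all problems, as a percentage.

    results: list of (n, c) pairs, one per problem (assumed input shape).
    """
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

For k = 1 the estimate reduces to c / n per problem, so a Pass@1 score is the average fraction of generated samples that pass the unit tests.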
Top Model: Phi 4 - 14B
Top Score: -
Model Count: 17
Data version: -
Data source: papers or GitHub evaluation results
Ranking Table
| Model | Parameters (B) | HumanEval Pass@1 | MBPP Pass@1 | Organization | License |
|---|---|---|---|---|---|
| Phi 4 - 14B | 14.0 | 82.60 | - | Microsoft Azure | - |
| WizardCoder-Python-13B-V1.0 | 13.0 | 64.00 | 54.60 | WizardLM Team | - |
| PanGu-Coder2 | 15.0 | 61.64 | - | Huawei | - |
| WizardCoder-15B-V1.0 | 15.0 | 57.30 | - | WizardLM Team | - |
| Qwen2.5-14B | 14.0 | 56.70 | 76.70 | Alibaba | - |
| Moonlight-16B-A3B-Instruct | 16.0 | 48.10 | 63.80 | Moonshot AI | - |
| CodeLLaMA-Python-13B | 13.0 | 43.30 | 49.00 | Facebook AI Research Lab | - |
| CodeLLaMA-Instruct-13B | 13.0 | 42.70 | 49.40 | Facebook AI Research Lab | - |
| WizardLM-30B-V1 | 30.0 | 37.80 | - | WizardLM Team | - |
| CodeLLaMA-13B | 13.0 | 36.00 | 47.00 | Facebook AI Research Lab | - |
| StarCoder | 15.5 | 33.60 | 52.70 | BigCode | - |
| Qwen-14B | 14.0 | 32.30 | 40.80 | Alibaba | - |
| StarCoderBase | 15.5 | 30.40 | 49.00 | BigCode | - |
| CodeGeeX | 13.0 | 22.90 | - | Zhipu AI | - |
| LLaMA2 13B | 13.0 | 20.10 | 27.60 | Facebook AI Research Lab | - |
| Baichuan2-13B-Base | 13.0 | 17.07 | 30.20 | Baichuan Intelligence | - |
| Baichuan 13B - Base | 13.0 | 11.59 | 22.90 | Baichuan Intelligence | - |
Data is for reference only. Official sources are authoritative.
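Because several models report no MBPP score, re-ranking the table by a different column needs to skip the missing cells. The sketch below shows one way to do that on a five-row subset copied from the table. The names `parse_score`, `rank_by`, and `ROWS` are hypothetical helpers, not part of any DataLearner tooling.

```python
from typing import List, Optional, Tuple

# Markers the table uses for "no reported score" (hyphen or U+2014 dash).
MISSING = {"", "-", "\u2014"}

# Subset of the leaderboard: (model, HumanEval Pass@1, MBPP Pass@1).
ROWS = [
    ("Phi 4 - 14B", "82.60", "\u2014"),
    ("WizardCoder-Python-13B-V1.0", "64.00", "54.60"),
    ("Qwen2.5-14B", "56.70", "76.70"),
    ("Moonlight-16B-A3B-Instruct", "48.10", "63.80"),
    ("StarCoder", "33.60", "52.70"),
]


def parse_score(cell: str) -> Optional[float]:
    """Convert a table cell to a float, or None when no score is reported."""
    cell = cell.strip()
    return None if cell in MISSING else float(cell)


def rank_by(rows, column: int) -> List[Tuple[str, float]]:
    """Rank models by one score column (1 = HumanEval, 2 = MBPP), best first.

    Models without a score in that column are left out rather than
    treated as zero.
    """
    scored = [(row[0], parse_score(row[column])) for row in rows]
    present = [(name, score) for name, score in scored if score is not None]
    return sorted(present, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, score in rank_by(ROWS, 2):
        print(f"{name}: {score:.2f}")
```

Ranked by HumanEval Pass@1, Phi 4 - 14B leads this subset. Ranked by MBPP Pass@1, Qwen2.5-14B comes first and Phi 4 - 14B drops out, because it has no reported MBPP score.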








