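This page presents evaluation results for current mainstream large language models on coding ability, covering benchmark datasets such as HumanEval and MBPP.
Data source: papers or GitHub evaluation results.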
Data is for reference only; official sources are authoritative.

| Model | Parameters (×100M) | HumanEval Pass@1 | MBPP Pass@1 | Organization | License |
|---|---|---|---|---|---|
| Phi 4 - 14B | 140.0 | 82.60 | / | Microsoft Azure | / |
| WizardCoder-Python-13B-V1.0 | 130.0 | 64 | 54.60 | WizardLM Team | / |
| PanGu-Coder2 | 150.0 | 61.64 | / | Huawei | / |
| WizardCoder-15B-V1.0 | 150.0 | 57.30 | / | WizardLM Team | / |
| Qwen2.5-14B | 140.0 | 56.70 | 76.70 | Alibaba | / |
| Moonlight-16B-A3B-Instruct | 160.0 | 48.10 | 63.80 | Moonshot AI | / |
| CodeLLaMA-Python-13B | 130.0 | 43.30 | 49 | Facebook AI Research | / |
| CodeLLaMA-Instruct-13B | 130.0 | 42.70 | 49.40 | Facebook AI Research | / |
| WizardLM-30B-V1 | 300.0 | 37.80 | / | WizardLM Team | / |
| CodeLLaMA-13B | 130.0 | 36 | 47 | Facebook AI Research | / |
| StarCoder | 155.0 | 33.60 | 52.70 | BigCode | / |
| Qwen-14B | 140.0 | 32.30 | 40.80 | Alibaba | / |
| StarCoderBase | 155.0 | 30.40 | 49 | BigCode | / |
| CodeGeeX | 130.0 | 22.90 | / | Zhipu AI | / |
| LLaMA2 13B | 130.0 | 20.10 | 27.60 | Facebook AI Research | / |
| Baichuan2-13B-Base | 130.0 | 17.07 | 30.20 | Baichuan Intelligence | / |
| Baichuan 13B - Base | 130.0 | 11.59 | 22.90 | Baichuan Intelligence | / |
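
The Pass@1 columns above report the share of benchmark problems a model solves. This page does not say how each score was computed, so the following is only a minimal sketch of the commonly used unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k), where n samples are drawn per problem and c of them pass the unit tests. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and not taken from any of the sources listed here.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: samples generated, c: samples that pass all tests, k: budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random
    # size-k subset of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int = 1) -> float:
    """Mean pass@k over all problems, as a percentage.

    results holds one (n, c) pair per benchmark problem.
    """
    scores = [pass_at_k(n, c, k) for n, c in results]
    return 100.0 * sum(scores) / len(scores)
```

For k = 1 the estimator reduces to c / n per problem, so with a single greedy sample per problem the benchmark score is simply the percentage of problems solved. Scores obtained under different sampling settings (temperature, number of samples, prompt format) may not be directly comparable, which is one reason to check the original sources.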