LLM Coding Leaderboard

Name: LLM Coding Leaderboard
Creator: DataLearner
License: https://creativecommons.org/licenses/by/4.0/

This page provides current LLM coding evaluation results, including HumanEval and MBPP Pass@1 scores.
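For readers unfamiliar with the metric, Pass@1 is the k = 1 case of pass@k, the probability that at least one of k sampled solutions passes a problem's unit tests. The sketch below uses the unbiased estimator commonly cited with HumanEval, 1 - C(n-c, k) / C(n, k). This page does not state how the listed scores were computed, so treat this only as an illustration of the metric; the function names are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: n samples drawn, c of them pass."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the chance that a random size-k subset
    # of the n samples contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results, k=1):
    """Average pass@k over problems, as a percentage.

    results is a list of (n, c) pairs, one pair per problem.
    """
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)
```

With k = 1 the estimator reduces to c / n per problem, so a benchmark-level Pass@1 is the mean fraction of passing samples across problems.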

Top Model: Llama3.3-70B-Instruct

Data source: papers or GitHub evaluation results

Ranking Table

Model	Parameters (100M)	HumanEval Pass@1	MBPP Pass@1	Organization	License
Llama3.3-70B-Instruct	700	88.40	87.60	Facebook AI Research	—
Qwen2-72B-Instruct	720	86.00	80.20	Alibaba	—
Llama3-70B	700	81.70	—	Facebook AI Research	—
Llama3-70B-Instruct	700	81.70	—	Facebook AI Research	—
Llama3.1-70B-Instruct	700	80.50	86.00	Facebook AI Research	—
Gemini-pro	1,000	67.70	—	DeepMind	—
Qwen2-72B	727	64.60	76.90	Alibaba	—
Qwen2.5-72B	727	59.10	84.70	Alibaba	—
Qwen2-57B-A14B	570	53.00	71.90	Alibaba	—
Qwen1.5-72B-Chat	720	41.50	53.40	Alibaba	—
Mixtral-8×7B-MoE	450	40.20	60.70	MistralAI	—
Qwen-72B	720	35.40	52.20	Alibaba	—
LLaMA2 70B	700	30.50	45.40	Facebook AI Research	—
XVERSE-65B	650	26.80	—	XVERSE	—
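The table is sorted by HumanEval Pass@1, and several models have no MBPP score. As a minimal sketch of re-ranking by another column, the snippet below parses tab-separated rows and skips entries marked "—". The sample holds three rows from the ranking table; the helper names `parse_score` and `rank_by` are hypothetical.

```python
import csv
import io

# Three rows from the ranking table, tab-separated; "—" marks a missing score.
SAMPLE = (
    "Model\tHumanEval Pass@1\tMBPP Pass@1\n"
    "Llama3.3-70B-Instruct\t88.40\t87.60\n"
    "Llama3-70B\t81.70\t—\n"
    "Qwen2.5-72B\t59.10\t84.70\n"
)


def parse_score(cell):
    """Return the score as a float, or None when the cell is empty or '—'."""
    cell = cell.strip()
    return None if cell in ("", "—") else float(cell)


def rank_by(text, metric):
    """Return (model, score) pairs sorted by the given column, highest first."""
    rows = csv.DictReader(io.StringIO(text), delimiter="\t")
    scored = [(row["Model"], parse_score(row[metric])) for row in rows]
    # Models without a reported score are left out rather than ranked as zero.
    scored = [(model, score) for model, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


for model, score in rank_by(SAMPLE, "MBPP Pass@1"):
    print(f"{model}\t{score:.2f}")
```

Dropping models with missing scores, instead of treating "—" as 0, keeps an unreported result from reading as a poor one.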

Data is for reference only. Official sources are authoritative.