LLM Coding Leaderboard

Name: LLM Coding Leaderboard
Creator: DataLearner
License: https://creativecommons.org/licenses/by/4.0/

This page provides current LLM coding evaluation results, including HumanEval and MBPP Pass@1 scores.
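
For readers unfamiliar with the metric, here is a minimal sketch of how a Pass@1 score can be computed, assuming the unbiased pass@k estimator commonly used with HumanEval. The function names `pass_at_k` and `benchmark_pass_at_k` are illustrative and do not come from this page. The page does not say how each listed score was produced (for example, greedy decoding versus sampling), so this is only an illustration of the metric.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which pass the tests.

    Uses 1 - C(n - c, k) / C(n, k): the probability that at least one of
    k samples drawn without replacement from the n is correct.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average per-problem pass@k over (n, c) pairs, as a percentage."""
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in results) / len(results)


if __name__ == "__main__":
    # Hypothetical per-problem results: (samples generated, samples passing).
    toy_results = [(10, 3), (10, 10), (10, 0)]
    print(f"pass@1 = {benchmark_pass_at_k(toy_results, k=1):.2f}")
```

For k = 1 the estimator reduces to c / n per problem, so Pass@1 is the average fraction of generated samples that pass the tests.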

Top Model: Qwen2.5-Coder-32B-Instruct

Data source: papers or GitHub evaluation results

Ranking Table

Model	Parameters (B)	HumanEval Pass@1 (%)	MBPP Pass@1 (%)	Organization	License
Qwen2.5-Coder-32B-Instruct	32	92.70	90.20	Alibaba	—
Mistral Small 24B Instruct 2501	24	84.80	—	MistralAI	—
DeepSeek Coder-33B Instruct	33	79.30	70.00	DeepSeek-AI	—
WizardCoder-Python-34B	34	73.20	—	WizardLM Team	—
Phind-CodeLlama-34B-Python-v1	34	69.50	—	Phind	—
Phind-CodeLlama-34B-v1	34	67.60	—	Phind	—
Codestral	22	61.50	78.20	MistralAI	—
Qwen2.5-32B	32	58.50	84.50	Alibaba	—
CodeLLaMA-Python-34B	34	53.70	56.20	Facebook AI Research	—
YAYI2-30B	30	53.10	45.80	Wenge Technology	—
CodeLLaMA-34B	34	48.80	55.00	Facebook AI Research	—
Yi-1.5-34B	34	46.30	65.50	01.AI	—
CodeLLaMA-Instruct-34B	34	41.50	57.00	Facebook AI Research	—
Grok-0	33	39.70	—	xAI	—
Qwen1.5-32B	32	37.20	49.40	Alibaba	—
Aquila2-34B	34	35.40	—	Beijing Academy of Artificial Intelligence (BAAI)	—
XVERSE-MoE-A4.2B	25.8	29.90	—	XVERSE	—
LLaMA2 34B	34	22.60	33.80	Facebook AI Research	—
Mistral Small 24B Base 2501	24	—	69.64	MistralAI	—
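
The table is ordered by HumanEval Pass@1, and several models are missing one of the two scores. As a rough sketch of how the same rows could be re-ranked by either metric while keeping unscored models at the bottom, the snippet below parses a small subset of rows copied from the table (model name, HumanEval, MBPP). The helper names `parse_score`, `parse_rows`, and `rank_by` are illustrative, not part of the page.

```python
from typing import Optional

MISSING = "—"  # marker the table uses for an unreported score

# Subset of the leaderboard rows: model, HumanEval Pass@1, MBPP Pass@1.
ROWS = """\
Qwen2.5-Coder-32B-Instruct\t92.70\t90.20
Mistral Small 24B Instruct 2501\t84.80\t—
Codestral\t61.50\t78.20
Qwen2.5-32B\t58.50\t84.50
Mistral Small 24B Base 2501\t—\t69.64
"""


def parse_score(cell: str) -> Optional[float]:
    """Convert a score cell to float, or None when the score is unreported."""
    cell = cell.strip()
    return None if cell == MISSING else float(cell)


def parse_rows(text: str) -> list[dict]:
    """Parse tab-separated rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        model, humaneval, mbpp = line.split("\t")
        rows.append({
            "model": model.strip(),
            "humaneval": parse_score(humaneval),
            "mbpp": parse_score(mbpp),
        })
    return rows


def rank_by(rows: list[dict], metric: str) -> list[dict]:
    """Sort by the metric in descending order; unscored models go last."""
    scored = [r for r in rows if r[metric] is not None]
    unscored = [r for r in rows if r[metric] is None]
    return sorted(scored, key=lambda r: r[metric], reverse=True) + unscored


if __name__ == "__main__":
    for metric in ("humaneval", "mbpp"):
        print(metric, [r["model"] for r in rank_by(parse_rows(ROWS), metric)])
```

Ranking by MBPP instead of HumanEval changes the order. For example, Qwen2.5-32B (84.50) moves ahead of Codestral (78.20), so the position of a model in the table depends on which benchmark is used as the sort key.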

Data is for reference only; official sources are authoritative.