A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.

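LLM Coding Ability Leaderboard

This page lists evaluation results for current mainstream large language models on coding tasks, covering benchmarks such as HumanEval and MBPP.

Data source: papers or GitHub evaluation results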

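| Model | Parameters (B) | HumanEval Pass@1 | MBPP Pass@1 | Organization | License |
|---|---|---|---|---|---|
| OpenAI o1-mini | / | 92.40 | / | OpenAI | / |
| Claude 3.5 Sonnet | / | 92 | / | Anthropic | / |
| Llama3.1-405B Instruct | 405 | 89 | 88.60 | Facebook AI Research | / |
| DeepSeek V2.5 | 236 | 89 | / | DeepSeek-AI | / |
| Amazon Nova Pro | / | 89 | / | Amazon | / |
| Grok 2 | 269 | 88.40 | / | xAI | / |
| Codestral 25.01 | / | 86.60 | 80.20 | MistralAI | / |
| GPT-4 | 175 | 85.40 | 83.50 | OpenAI | / |
| Amazon Nova Lite | / | 85.40 | / | Amazon | / |
| Llama3-400B-Instruct-InTraining | 400 | 84.10 | / | Facebook AI Research | / |
| DeepSeek-V3 | 681 | 82.60 | / | DeepSeek-AI | / |
| Amazon Nova Micro | / | 81.10 | / | Amazon | / |
| C4AI Command A (202503) | 111 | 80 | / | CohereAI | / |
| Grok-1.5 | / | 74.10 | / | xAI | / |
| DeepSeek-V2-236B-Chat | 236 | 73.80 | 61.40 | DeepSeek-AI | / |
| Qwen2.5-Max | / | 73.20 | 80.60 | Alibaba | / |
| DBRX Instruct | 132 | 70.10 | / | databricks | / |
| DeepSeek-V3-Base | 681 | 65.20 | 75.40 | DeepSeek-AI | / |
| Grok-1 | 314 | 63.20 | / | xAI | / |
| Qwen1.5-110B | 110 | 52.40 | 58.10 | Alibaba | / |
| GPT-3.5 | 175 | 48.10 | 52.20 | OpenAI | / |
| Mixtral-8×22B-MoE | 141 | 45.10 | 71.20 | MistralAI | / |
| DeepSeek-V2-236B | 236 | 40.90 | 66.60 | DeepSeek-AI | / |
| PaLM-Coder | 540 | 35.90 | 47 | Google Research | / |
| Codex | 175 | 28.81 | / | OpenAI | / |
| PaLM | 540 | 26.20 | 47 | Google Research | / |

Data is for reference only; official sources are authoritative. A "/" marks a value the page does not list. Parameter counts are converted to billions from the page's raw figures (for example, 4050.0 for Llama3.1-405B).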
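The page does not say how each Pass@1 score was computed, and the underlying papers may differ in sampling settings. As a rough illustration, the sketch below uses the unbiased pass@k estimator commonly associated with HumanEval: for a problem with n generated samples of which c pass the unit tests, pass@k = 1 - C(n-c, k) / C(n, k), averaged over problems. The function names and the sample data are hypothetical and are not taken from the leaderboard.

```python
import math
from typing import Iterable, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that passed all unit tests
    k: number of attempts the metric allows
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # Fewer than k failing samples: every draw of k contains a pass.
    if n - c < k:
        return 1.0
    # Probability that a random draw of k samples contains no pass,
    # subtracted from 1.
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)


def benchmark_score(results: Iterable[Tuple[int, int]], k: int = 1) -> float:
    """Average pass@k over problems, reported as a percentage.

    results: one (n, c) pair per benchmark problem.
    """
    pairs = list(results)
    if not pairs:
        raise ValueError("results must contain at least one problem")
    return 100.0 * sum(pass_at_k(n, c, k) for n, c in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical run: 4 problems, 1 sample each, 3 problems solved.
    print(benchmark_score([(1, 1), (1, 0), (1, 1), (1, 1)], k=1))
```

With a single sample per problem, Pass@1 reduces to the fraction of problems solved on the first attempt, which is how the percentages in the table are most naturally read.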