DataLearnerAI

A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


LLM Coding Ability Benchmark Leaderboard

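This page provides a leaderboard of LLM coding ability evaluations, covering datasets such as SWE-Bench, LiveCodeBench, and HumanEval, and compares models including GPT, Claude, Qwen, and DeepSeek.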

Updated on: 2025/10/12 20:54:51


LLM Performance Results

Data source: DataLearnerAI
Rank  Model                  SWE-bench Verified  LiveCodeBench  HumanEval  Params (B)  License
1     Llama3.3-70B-Instruct  0.00                33.30          88.40      70.0        Free commercial
2     Llama3.1-70B-Instruct  0.00                33.30          80.50      70.0        Free commercial
3     Qwen2.5-72B            0.00                0.00           59.10      72.7        Free commercial
4     Hunyuan-A13B-Instruct  0.00                63.90          0.00       80.0        Free commercial
5     Pangu Pro MoE          0.00                59.60          0.00       71.9        Free commercial
6     Qwen3-Next             0.00                56.60          0.00       80.0        Free commercial