DataLearner AI

A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


LLM Coding Ability Leaderboard

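This page provides a leaderboard of large language model coding ability. It covers benchmarks such as SWE-Bench, LiveCodeBench, and HumanEval, and compares models including GPT, Claude, Qwen, and DeepSeek.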

Updated on: 2025/10/12 20:54:51


LLM Performance Results

Data source: DataLearnerAI
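| Rank | Model | SWE-bench Verified | LiveCodeBench | HumanEval | Params (B) | License |
|------|-------|--------------------|---------------|-----------|------------|---------|
| 1 | Qwen3-Coder-Next | 70.60 | 0.00 | 0.00 | 80B | Free commercial |
| 2 | Llama3.1-8B-Instruct | 0.00 | 0.00 | 66.50 | 80B | Free commercial |
| 3 | Qwen2.5-7B | 0.00 | 0.00 | 57.90 | 70B | Free commercial |
| 4 | Gemma 2 - 9B | 0.00 | 0.00 | 37.80 | 90B | Free commercial |
| 5 | Llama3.1-8B | 0.00 | 0.00 | 33.50 | 80B | Free commercial |
| 6 | Mistral-7B-Instruct-v0.3 | 0.00 | 0.00 | 29.30 | 70B | Free commercial |
| 7 | Pangu Embedded | 0.00 | 67.10 | 0.00 | 70B | Free commercial |
| 8 | Qwen3-8B | 0.00 | 61.80 | 0.00 | 80B | Free commercial |
| 9 | Hunyuan-7B | 0.00 | 57.00 | 0.00 | 70B | Free commercial |
| 10 | Qwen3-4B-Thinking-2507 | 0.00 | 55.20 | 0.00 | 40B | Free commercial |
| 11 | GLM-4-9B-Chat | 0.00 | 51.80 | 0.00 | 90B | Free commercial |
| 12 | Qwen3-4B-2507 | 0.00 | 35.10 | 0.00 | 40B | Free commercial |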
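
The combined ranking above mixes three benchmarks, so it can help to re-rank the same rows one benchmark at a time. The following is a minimal Python sketch of that, not the site's own ranking logic. It assumes that a score of 0.00 means "no result reported" rather than a measured zero; the page does not state this. The names `ROWS`, `BENCHMARKS`, and `rank_by` are hypothetical helpers introduced for illustration.

```python
# Rows copied from the leaderboard table:
# (model, SWE-bench Verified, LiveCodeBench, HumanEval)
ROWS = [
    ("Qwen3-Coder-Next", 70.60, 0.00, 0.00),
    ("Llama3.1-8B-Instruct", 0.00, 0.00, 66.50),
    ("Qwen2.5-7B", 0.00, 0.00, 57.90),
    ("Gemma 2 - 9B", 0.00, 0.00, 37.80),
    ("Llama3.1-8B", 0.00, 0.00, 33.50),
    ("Mistral-7B-Instruct-v0.3", 0.00, 0.00, 29.30),
    ("Pangu Embedded", 0.00, 67.10, 0.00),
    ("Qwen3-8B", 0.00, 61.80, 0.00),
    ("Hunyuan-7B", 0.00, 57.00, 0.00),
    ("Qwen3-4B-Thinking-2507", 0.00, 55.20, 0.00),
    ("GLM-4-9B-Chat", 0.00, 51.80, 0.00),
    ("Qwen3-4B-2507", 0.00, 35.10, 0.00),
]

# Column order matches the tuples in ROWS (offset by one for the model name).
BENCHMARKS = ("SWE-bench Verified", "LiveCodeBench", "HumanEval")


def rank_by(benchmark, rows=ROWS):
    """Return (model, score) pairs for one benchmark, best score first.

    ASSUMPTION: a score of 0.00 is a placeholder for a missing result,
    so those rows are skipped instead of being ranked last.
    """
    col = BENCHMARKS.index(benchmark) + 1
    scored = [(row[0], row[col]) for row in rows if row[col] > 0.0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for name in BENCHMARKS:
        print(name)
        for position, (model, score) in enumerate(rank_by(name), start=1):
            print(f"  {position}. {model}: {score:.2f}")
```

Under this assumption each model appears in exactly one per-benchmark list, which makes it clear that the scores in different columns are not directly comparable with each other.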