© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


LLM Coding Ability Benchmark Leaderboard

This page provides a leaderboard of LLM coding ability. It covers benchmarks such as SWE-Bench, LiveCodeBench, and HumanEval, and compares models such as GPT, Claude, Qwen, and DeepSeek.

Updated on: 2025/10/12 20:54:51

LLM Performance Results

Data source: DataLearnerAI
| Rank | Model | SWE-bench Verified | LiveCodeBench | HumanEval | Params (B) | License |
|---|---|---|---|---|---|---|
| 1 | GLM-4.7-Flash | 59.20 | 0.00 | 0.00 | 31.0 | Free commercial |
| 2 | Devstral Small 1.1 | 53.60 | 0.00 | 0.00 | 24.0 | Free commercial |
| 3 | Qwen3-Coder-Flash | 51.60 | 0.00 | 0.00 | 30.5 | Free commercial |
| 4 | Devstral Small 1.0 | 46.80 | 0.00 | 0.00 | 24.0 | Free commercial |
| 5 | GPT OSS 20B | 34.00 | 0.00 | 0.00 | 21.0 | Free commercial |
| 6 | Qwen3-30B-A3B-2507 | 22.00 | 43.20 | 0.00 | 30.5 | Free commercial |
| 7 | Mistral-Small-3.1-24B-Instruct-2503 | 0.00 | 0.00 | 88.41 | 24.0 | Free commercial |
| 8 | Qwen2.5-32B | 0.00 | 51.20 | 88.40 | 32.0 | Free commercial |
| 9 | Gemma 3 - 27B (IT) | 0.00 | 29.70 | 87.80 | 27.0 | Free commercial |
| 10 | Codestral | 0.00 | 31.50 | 81.10 | 22.0 | Non-commercial |
| 11 | C4AI Aya Vision 32B | 0.00 | 0.00 | 62.20 | 32.0 | Non-commercial |
| 12 | QwQ-32B | 0.00 | 0.00 | 19.00 | 32.5 | Free commercial |
| 13 | Qwen3-235B-A22B-Thinking | 0.00 | 74.10 | 0.00 | 30.5 | Free commercial |
| 14 | Qwen3-32B | 0.00 | 65.70 | 0.00 | 32.0 | Free commercial |
| 15 | Magistral-Small-2506 | 0.00 | 55.84 | 0.00 | 24.0 | Free commercial |
| 16 | Qwen3-30B-A3B | 0.00 | 29.00 | 0.00 | 30.5 | Free commercial |