SWE-bench Multilingual

Updated: Apr 24, 2026 (837 views)
Current SOTA: Claude Mythos Preview (Anthropic), score 87.30
Problem count: 300
Institution: not specified
Category: Coding and Software Engineering
Metric: Accuracy
Language: Multilingual
Difficulty: Medium

Overview

SWE-bench Multilingual is a coding and software engineering benchmark with 300 problems; models are scored by accuracy. This page summarizes the benchmark's metrics and lists the model leaderboard results compiled by DataLearnerAI.

Latest SWE-bench Multilingual model rankings and full benchmark leaderboard

Browse the latest scores, model modes, release dates, and parameter sizes for SWE-bench Multilingual.

Source: DataLearnerAI

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.
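The source ordering described above can be read as a simple priority rule. The following is only a minimal sketch of that idea, not DataLearnerAI's actual pipeline: the labels `official`, `leaderboard`, and `third_party`, the `ReportedScore` type, and the `pick_preferred` helper are all hypothetical, and the sample scores are made up for illustration.

```python
from dataclasses import dataclass

# Lower number = preferred source, following the order stated in the text.
# The label names are assumptions made for this sketch.
SOURCE_PRIORITY = {"official": 0, "leaderboard": 1, "third_party": 2}


@dataclass(frozen=True)
class ReportedScore:
    model: str
    score: float
    source: str  # one of the SOURCE_PRIORITY keys


def pick_preferred(reports):
    """Keep, for each model, the report from the most preferred source."""
    best = {}
    for report in reports:
        current = best.get(report.model)
        if (current is None
                or SOURCE_PRIORITY[report.source] < SOURCE_PRIORITY[current.source]):
            best[report.model] = report
    return best


# Made-up example data, not taken from the leaderboard.
reports = [
    ReportedScore("model-a", 70.0, "third_party"),
    ReportedScore("model-a", 72.5, "official"),
    ReportedScore("model-b", 65.0, "leaderboard"),
]
preferred = pick_preferred(reports)
```

Here `preferred["model-a"]` holds the `official` report, because an official release outranks a third-party evaluation of the same model, while `model-b` keeps its only available report.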

SWE-bench Multilingual Rank

Rank | Organization | Model | Mode | Score | Release date | Parameters | License
1 | Anthropic | Claude Mythos Preview | Extended Thinking, Tools | 87.30 | 2026-04-07 | Unknown | Closed
2 | Moonshot AI | Kimi K2.6 | Thinking Enabled, Tools | 76.70 | 2026-04-20 | 1000B | Free Commercial
3 | DeepSeek-AI | DeepSeek-V4-Pro | Thinking Level: Extra High, Tools | 76.20 | 2026-04-24 | 1600B | Free Commercial
4 | DeepSeek-AI | DeepSeek-V4-Pro | Thinking Level: High, Tools | 74.10 | 2026-04-24 | 1600B | Free Commercial
5 | Alibaba | Qwen 3.6 Plus Preview | Thinking Enabled | 73.80 | 2026-03-31 | Unknown | Closed
6 | Cursor | Composer 2 | Thinking Enabled | 73.70 | 2026-03-19 | Unknown | Closed
7 | DeepSeek-AI | DeepSeek-V4-Flash | Thinking Level: Extra High, Tools | 73.30 | 2026-04-24 | 284B | Free Commercial
8 | Moonshot AI | Kimi K2.5 | Thinking Enabled | 73.00 | 2026-01-27 | 1000B | Free Commercial
9 | Anthropic | Claude Opus 4.6 | Extended Thinking, Tools | 72.00 | 2026-02-05 | Unknown | Closed
10 | Alibaba | Qwen3.6-27B | Thinking Enabled, Tools | 71.30 | 2026-04-22 | 27B | Free Commercial
11 | DeepSeek-AI | DeepSeek-V4-Flash | Thinking Level: High, Tools | 70.20 | 2026-04-24 | 284B | Free Commercial
12 | DeepSeek-AI | DeepSeek-V4-Pro | Standard Mode, Tools | 69.80 | 2026-04-24 | 1600B | Free Commercial
13 | DeepSeek-AI | DeepSeek-V4-Flash | Standard Mode, Tools | 69.70 | 2026-04-24 | 284B | Free Commercial
14 | Alibaba | Qwen3.5-397B-A17B | Thinking Enabled | 69.30 | 2026-02-16 | 397B | Free Commercial
15 | Alibaba | Qwen3.6-35B-A3B | Thinking Enabled | 67.20 | 2026-04-16 | 35B | Free Commercial
16 | Cursor | Composer 1.5 | Thinking Enabled | 65.90 | 2026-02-09 | Unknown | Closed
17 | Cursor | Composer 1 | Thinking Enabled | 56.90 | 2025-10-29 | Unknown | Closed