Aider-Polyglot is a coding benchmark that evaluates how well AI models write and edit code across multiple programming languages. Review its overview, metrics, official resources, and model leaderboard results on DataLearnerAI.
Browse the latest scores, model modes, release dates, and parameter sizes for Aider-Polyglot.
Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.
| Rank | Model | Score | Release date | Parameters | License |
|---|---|---|---|---|---|
| 1 | GPT-5 (Thinking Level · High) | 88.00 | 2025-08-07 | Unknown | Closed |
| 2 | GPT-5 (Thinking Level · Medium) | 86.70 | 2025-08-07 | Unknown | Closed |
| 3 | o3-pro (Thinking Level · High) | 84.90 | 2025-06-10 | Unknown | Closed |
| 4 | | 83.10 | 2025-06-05 | Unknown | Closed |
| 5 | OpenAI o3 (Thinking Level · High) | 81.30 | 2025-04-16 | Unknown | Closed |
| 6 | GPT-5 (Thinking Level · Low) | 81.30 | 2025-08-07 | Unknown | Closed |
| 7 | Grok 4 (Thinking Level · High) | 79.60 | 2025-07-10 | Unknown | Closed |
| 8 | Gemini 2.5-Pro (Thinking Enabled) | 79.10 | 2025-06-05 | Unknown | Closed |
| 9 | OpenAI o3 (Standard Mode) | 76.90 | 2025-04-16 | Unknown | Closed |
| 10 | Gemini-2.5-Pro-Preview-05-06 (Standard Mode) | 76.90 | 2025-05-06 | Unknown | Closed |
| 11 | DeepSeek V3.2-Exp (Thinking Enabled) | 74.20 | 2025-09-29 | 671B | Free Commercial |
| 12 | Gemini 2.5 Pro Experimental 03-25 (Standard Mode) | 72.90 | 2025-03-25 | Unknown | Closed |
| 13 | OpenAI o4-mini (Thinking Level · High) | 72.00 | 2025-04-16 | Unknown | Closed |
| 14 | | 72.00 | 2025-05-23 | Unknown | Closed |
| 15 | DeepSeek-R1-0528 (Thinking Enabled) | 71.40 | 2025-05-28 | 671B | Free Commercial |
| 16 | Claude Opus 4 (Standard Mode) | 70.70 | 2025-05-23 | Unknown | Closed |
| 17 | DeepSeek V3.2-Exp (Standard Mode) | 70.20 | 2025-09-29 | 671B | Free Commercial |
| 18 | | 64.90 | 2025-02-25 | Unknown | Closed |
| 19 | OpenAI o1 (Thinking Level · High) | 61.70 | 2024-12-05 | Unknown | Closed |
| 20 | | 61.30 | 2025-05-23 | Unknown | Closed |
| 21 | Claude Sonnet 3.7 (Standard Mode) | 60.40 | 2025-02-25 | Unknown | Closed |
| 22 | OpenAI o3-mini (Thinking Level · High) | 60.40 | 2025-01-31 | Unknown | Closed |
| 23 | Qwen3-235B-A22B (Standard Mode) | 59.60 | 2025-04-28 | 235B | Free Commercial |
| 24 | Kimi K2 (Standard Mode) | 59.10 | 2025-07-11 | 1000B | Free Commercial |
| 25 | DeepSeek-R1 (Thinking Enabled) | 56.90 | 2025-01-20 | 671B | Free Commercial |
| 26 | Claude Sonnet 4 (Standard Mode) | 56.40 | 2025-05-23 | Unknown | Closed |
| 27 | DeepSeek-V3-0324 (Standard Mode) | 55.10 | 2025-03-24 | 671B | Free Commercial |
| 28 | | 55.10 | 2025-04-17 | Unknown | Closed |
| 29 | OpenAI o3-mini (Thinking Level · Medium) | 53.80 | 2025-01-31 | Unknown | Closed |
| 30 | Grok 3 (Standard Mode) | 53.30 | 2025-02-17 | Unknown | Closed |