Pinch Bench is an AI benchmark used to evaluate model capabilities. Review its overview, metrics, official resources, and model leaderboard results on DataLearnerAI.
Browse the latest scores, model modes, release dates, and parameter sizes for Pinch Bench.
Data is sourced in priority order: official releases (GitHub, Hugging Face, papers) first, then benchmark leaderboards, then third-party evaluators.
| Rank | Model | Mode | Score | Release date | Parameters | License |
|---|---|---|---|---|---|---|
| 1 | GPT-5.4 | Thinking Enabled, Tools | 90.50 | 2026-03-05 | Unknown | Closed |
| 2 | Qwen3.5-27B | Thinking Enabled, Tools | 90.00 | 2026-02-25 | 27B | Free Commercial |
| 3 | Qwen3.5-397B-A17B | Thinking Enabled, Tools | 89.10 | 2026-02-16 | 397B | Free Commercial |
| 4 | Claude Sonnet 4.5 | Thinking Enabled, Tools | 88.20 | 2025-09-30 | Unknown | Closed |
| 5 | Claude Sonnet 4.6 | Thinking Enabled, Tools | 88.00 | 2026-02-17 | Unknown | Closed |
| 6 | MiniMax M2.5 | Thinking Enabled, Tools | 87.80 | 2026-02-12 | 229B | Free Commercial |
| 7 | Claude Opus 4.6 | Thinking Enabled, Tools | 87.40 | 2026-02-05 | Unknown | Closed |
| 8 | Opus 4.5 | Extended Thinking, Tools | 87.20 | 2025-11-25 | Unknown | Closed |
| 9 | MiniMax-M2.7 | Thinking Enabled, Tools | 87.10 | 2026-03-18 | 229B | Non-Commercial |
| 10 | Gemini 3.1 Pro Preview | Thinking Enabled, Tools | 86.70 | 2026-02-20 | Unknown | Closed |
| 11 | GLM-5-Turbo | Thinking Enabled, Tools | 86.50 | 2026-03-15 | Unknown | Closed |
| 12 | GLM-5 | Thinking Enabled, Tools | 86.40 | 2026-02-11 | 744B | Free Commercial |
| 13 | GLM-4.5-Air | Thinking Enabled, Tools | 85.70 | 2025-07-28 | 106B | Free Commercial |
| 14 | Qwen3.5-122B-A10B | Thinking Enabled, Tools | 85.50 | 2026-02-25 | 122B | Free Commercial |
| 15 | Step 3.5 Flash | Thinking Enabled, Tools | 85.30 | 2026-02-02 | 196B | Free Commercial |
| 16 | Gemini 3.0 Flash | Thinking Enabled, Tools | 85.20 | 2025-12-17 | Unknown | Closed |
| 17 | Kimi K2.5 | Thinking Enabled, Tools | 84.80 | 2026-01-27 | 1000B | Free Commercial |
| 18 | DeepSeek V3.2 | Thinking Enabled, Tools | 84.30 | 2025-12-01 | 671B | Free Commercial |
| 19 | M2.1 | Thinking Enabled, Tools | 84.30 | 2025-12-23 | 230B | Free Commercial |
| 20 | Grok 4.1 Fast | Thinking Enabled, Tools | 82.40 | 2025-11-19 | Unknown | Closed |
| 21 | Haiku 4.5 | Thinking Enabled, Tools | 82.00 | 2025-10-15 | Unknown | Closed |
| 22 | Claude Sonnet 4 | Thinking Enabled, Tools | 80.50 | 2025-05-23 | Unknown | Closed |
| 23 | GPT-5-mini | Thinking Enabled, Tools | 80.30 | 2025-08-07 | Unknown | Closed |
| 24 | Qwen3-Max-Thinking | Thinking Enabled, Tools | 80.30 | 2026-01-26 | 1000B | Closed |
| 25 | Qwen3-Coder-Next | Thinking Enabled, Tools | 79.10 | 2026-02-03 | 8B | Free Commercial |
| 26 | Qwen3.5-35B-A3B | Thinking Enabled, Tools | 78.40 | 2026-02-25 | 35B | Free Commercial |
| 27 | GPT-4o mini | Thinking Enabled, Tools | 75.00 | 2024-07-18 | Unknown | Closed |
| 28 | Mistral Large 3 | Thinking Enabled, Tools | 72.20 | 2025-12-02 | 675B | Free Commercial |
| 29 | Gemini 2.5 Pro Experimental 03-25 | Thinking Enabled, Tools | 71.90 | 2025-03-25 | Unknown | Closed |
| 30 | GPT-4o | Thinking Enabled, Tools | 71.10 | 2024-05-13 | Unknown | Closed |