Quickly view LLM performance across benchmarks like MMLU Pro, HLE, SWE-Bench, and more. Compare models across general knowledge, coding, and reasoning capabilities. Customize your comparison by selecting specific models and benchmarks.
Detailed benchmark descriptions are available at: LLM Benchmark List & Guide
Data source: DataLearnerAI
A score of 0.00 indicates that no result is recorded for that model on that benchmark.

| # | Model | Benchmark 1 | Benchmark 2 | Benchmark 3 | Benchmark 4 | Benchmark 5 | Benchmark 6 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 3 | QwQ-32B | 76.00 | 58.00 | 0.00 | 91.00 | 79.50 | 0.00 |
| 4 | GPT OSS 20B | 74.00 | 71.50 | 34.00 | 0.00 | 96.00 | 0.00 |
| 5 | QwQ-32B-Preview | 70.97 | 0.00 | 0.00 | 90.60 | 50.00 | 0.00 |
| 6 | Qwen2.5-32B | 69.23 | 0.00 | 0.00 | 0.00 | 0.00 | 51.20 |
| 7 | Qwen3-30B-A3B | 69.10 | 54.80 | 0.00 | 0.00 | 0.00 | 29.00 |
| 8 | Mistral-Small-3.2 | 69.06 | 46.13 | 0.00 | 0.00 | 0.00 | 0.00 |
| 9 | Gemma 3 - 27B (IT) | 67.50 | 42.40 | 0.00 | 0.00 | 25.30 | 29.70 |
| 10 | Mistral-Small-3.1-24B-Instruct-2503 | 66.76 | 45.96 | 0.00 | 0.00 | 0.00 | 0.00 |
| 11 | Gemma2-27B | 56.54 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 12 | C4AI Aya Vision 32B | 47.16 | 33.84 | 0.00 | 0.00 | 0.00 | 0.00 |
| 13 | GLM-4.7-Flash | 0.00 | 75.20 | 59.20 | 0.00 | 0.00 | 0.00 |
| 14 | Qwen3-32B | 0.00 | 68.40 | 0.00 | 97.20 | 81.40 | 65.70 |
| 15 | Magistral-Small-2506 | 0.00 | 68.18 | 0.00 | 0.00 | 70.68 | 55.84 |
| 16 | Devstral Small 1.1 | 0.00 | 0.00 | 53.60 | 0.00 | 0.00 | 0.00 |
| 17 | Qwen3-Coder-Flash | 0.00 | 0.00 | 51.60 | 0.00 | 0.00 | 0.00 |
| 18 | Devstral Small 1.0 | 0.00 | 0.00 | 46.80 | 0.00 | 0.00 | 0.00 |
| 19 | Codestral | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 31.50 |
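For readers who want to run comparisons outside the interactive page, the sketch below shows one way to filter a leaderboard like the table above with pandas. It is illustrative only: the handful of rows are copied from the table, the generic column labels mirror the header used above, and treating 0.00 as a missing score (rather than a true zero) is an assumption about how the source encodes unreported results.

```python
# Minimal sketch of "customize your comparison by selecting specific models
# and benchmarks", done programmatically with pandas.
# Assumption: 0.00 in the source table means "no reported score", not a real zero.
import pandas as pd

cols = ["Model", "Benchmark 1", "Benchmark 2", "Benchmark 3",
        "Benchmark 4", "Benchmark 5", "Benchmark 6"]
rows = [
    ("QwQ-32B",              76.00, 58.00,  0.00, 91.00, 79.50,  0.00),
    ("GPT OSS 20B",          74.00, 71.50, 34.00,  0.00, 96.00,  0.00),
    ("Qwen3-32B",             0.00, 68.40,  0.00, 97.20, 81.40, 65.70),
    ("Magistral-Small-2506",  0.00, 68.18,  0.00,  0.00, 70.68, 55.84),
]
df = pd.DataFrame(rows, columns=cols).set_index("Model")

# Mask unreported scores so they do not distort comparisons or averages.
df = df.mask(df == 0.00)

# Select specific models and benchmarks, as the page's selectors do.
models = ["QwQ-32B", "Qwen3-32B"]
benchmarks = ["Benchmark 2", "Benchmark 5"]
print(df.loc[models, benchmarks])

# Rank the models on a single benchmark, ignoring missing scores.
print(df["Benchmark 2"].dropna().sort_values(ascending=False))
```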