| | o3-pro | 75.00 | 0.00 | 0.00 |
| 4 | M2.1 | 74.80 | 0.00 | 0.00 |
| 5 | Step 3.5 Flash | 74.40 | 86.40 | 0.00 |
| 6 | GLM-4.7 | 73.80 | 84.90 | 0.00 |
| 7 | DeepSeek V3.2 | 73.10 | 83.30 | 0.00 |
| 8 | Claude Opus 4 | 72.50 | 56.60 | 0.00 |
| 9 | Kimi K2 Thinking | 71.30 | 83.10 | 0.00 |
| 10 | Claude Sonnet 3.7 | 70.30 | 0.00 | 0.00 |
| 11 | MiniMax M2 | 69.40 | 83.00 | 0.00 |
| 12 | Kimi K2 0905 | 69.20 | 0.00 | 0.00 |
| 13 | DeepSeek-V3.1 Terminus | 68.40 | 80.00 | 0.00 |
| 14 | OpenAI o4-mini | 68.10 | 0.00 | 0.00 |
| 15 | GLM-4.6 | 68.00 | 84.50 | 0.00 |
| 16 | DeepSeek V3.2-Exp | 67.80 | 74.10 | 0.00 |
| 17 | Qwen3-Coder-480B-A35B | 67.00 | 0.00 | 0.00 |
| 18 | DeepSeek-V3.1 | 66.00 | 74.80 | 0.00 |
| 19 | GLM-4.5 | 64.20 | 72.90 | 0.00 |
| 20 | Gemini-2.5-Pro-Preview-05-06 | 63.20 | 77.10 | 0.00 |
| 21 | DeepSeek-R1-0528 | 57.60 | 73.30 | 0.00 |
| 22 | GLM-4.5-Air | 57.60 | 70.70 | 0.00 |
| 23 | MiniMax-M1-80k | 56.00 | 65.00 | 0.00 |
| 24 | MiniMax-M1-40k | 55.60 | 62.30 | 0.00 |
| 25 | GPT-4.1 | 54.60 | 40.50 | 0.00 |
| 26 | Kimi K2 | 51.80 | 53.70 | 0.00 |
| 27 | Gemini 2.5 Flash | 50.00 | 55.40 | 0.00 |
| 28 | OpenAI o3-mini (high) | 49.30 | 69.50 | 97.60 |
| 29 | DeepSeek-R1 | 49.20 | 65.90 | 0.00 |
| 30 | OpenAI o1 | 48.90 | 71.00 | 0.00 |
| 31 | DeepSeek-V3-0324 | 38.80 | 49.20 | 0.00 |
| 32 | GPT-4.5 | 38.00 | 46.40 | 0.00 |
| 33 | Qwen3-235B-A22B | 34.40 | 70.70 | 0.00 |
| 34 | Gemini 2.5 Flash-Lite | 27.60 | 34.30 | 0.00 |
| 35 | GPT-4.1 mini | 23.60 | 0.00 | 0.00 |
| 36 | Gemini 2.0 Flash Experimental | 21.40 | 29.10 | 0.00 |
| 37 | Kimi-k1.6-IOI-high | 0.00 | 73.80 | 0.00 |
| 38 | Step3 | 0.00 | 67.10 | 0.00 |
| 39 | Grok 3 | 0.00 | 70.60 | 0.00 |
| 40 | OpenAI o3-mini (medium) | 0.00 | 67.40 | 0.00 |
| 41 | Gemini 2.0 Flash-Lite | 0.00 | 28.90 | 0.00 |
| 42 | Llama 4 Scout Instruct | 0.00 | 32.80 | 0.00 |
| 43 | ERNIE-4.5-300B-A47B | 0.00 | 38.80 | 0.00 |
| 44 | ERNIE-4.5-VL-424B-A47B-Base | 0.00 | 38.80 | 0.00 |
| 45 | Llama 4 Maverick Instruct | 0.00 | 43.40 | 0.00 |
| 46 | Llama 4 Behemoth Instruct | 0.00 | 49.40 | 0.00 |
| 47 | Qwen3-235B-A22B-2507 | 0.00 | 51.80 | 0.00 |
| 48 | Magistral-Medium-2506 | 0.00 | 59.36 | 0.00 |
| 49 | QwQ-Max-Preview | 0.00 | 65.60 | 0.00 |
| 50 | Kimi-k1.6-IOI | 0.00 | 65.90 | 0.00 |
| 51 | Qwen3-235B-A22B-Thinking-2507 | 0.00 | 74.10 | 0.00 |
| 52 | Grok-3 - Reasoning Beta | 0.00 | 79.40 | 0.00 |
| 53 | Gemini 2.5 Pro Deep Think | 0.00 | 80.40 | 0.00 |
| 54 | Qwen2.5-Max | 0.00 | 0.00 | 73.20 |
| 55 | Grok-1.5 | 0.00 | 0.00 | 74.10 |
| 56 | Codestral 25.01 | 0.00 | 37.90 | 86.60 |
| 57 | Grok 2 | 0.00 | 0.00 | 88.40 |
| 58 | DeepSeek-V3 | 0.00 | 34.60 | 89.00 |
| 59 | Amazon Nova Pro | 0.00 | 0.00 | 89.00 |
| 60 | Llama3.1-405B Instruct | 0.00 | 30.20 | 89.00 |
| 61 | GPT-4o (2024-11-20) | 0.00 | 0.00 | 90.20 |
| 62 | Hunyuan-TurboS | 0.00 | 32.00 | 91.00 |
| 63 | Claude 3.5 Sonnet | 0.00 | 0.00 | 92.00 |
| 64 | OpenAI o1-mini | 0.00 | 52.00 | 92.40 |