Gemini 2.5 Pro Experimental 03-25 currently shows benchmark results led by AIME 2024 (9 / 62, score 92), GPQA Diamond (39 / 161, score 84), and SimpleQA (12 / 45, score 52.90).