Phi-4-instruct (reasoning-trained) Benchmark Details
Phi-4-instruct (reasoning-trained) currently shows benchmark results led by AIME 2024 (47 / 62, score 50), MATH-500 (36 / 43, score 90.40), and GPQA Diamond (143 / 166, score 49).
Benchmark Results
Phi-4-instruct (reasoning-trained)