Claude 3.5 Sonnet New Benchmark Details
Claude 3.5 Sonnet New currently shows benchmark results led by HumanEval (rank 3 of 39, score 93.70), BBH (rank 2 of 20, score 92.60), and MMLU (rank 17 of 64, score 88.30).
Benchmark Results
Comprehensive Evaluation (4 evaluations)
Benchmark / mode | Score | Rank/total
Programming and Software Engineering (3 evaluations)
Benchmark / mode | Score | Rank/total
Mathematical Reasoning (5 evaluations)
Benchmark / mode | Score | Rank/total