Claude Sonnet 3.7 Benchmark Details
Claude Sonnet 3.7's listed benchmark results are led by LiveBench (score 68.64, rank 24 of 52), SWE-bench Verified (score 70.30, rank 50 of 103), and GPQA Diamond (score 77, rank 85 of 175). One source link is attached for reference.
Benchmark Results
Comprehensive Evaluation
5 evaluations
Benchmark / mode
Score
Rank/total
Programming and Software Engineering
2 evaluations
Benchmark / mode
Score
Rank/total
Mathematical Reasoning
5 evaluations
Benchmark / mode
Score
Rank/total
Agent Capability Evaluation
5 evaluations
Benchmark / mode
Score
Rank/total