Claude Sonnet 3.7 Benchmark Details
Claude Sonnet 3.7's leading benchmark results are Aider-Polyglot (rank 18 of 59, score 64.90), Simple Bench (rank 31 of 63, score 46.40), and GPQA Diamond (rank 89 of 179, score 77). One source link is attached for reference.
Benchmark Results
General Knowledge
3 evaluations
Benchmark / mode | Score | Rank/total
Coding and Software Engineering
2 evaluations
Benchmark / mode | Score | Rank/total
Math and Reasoning
5 evaluations
Benchmark / mode | Score | Rank/total
Agent Level Benchmark
5 evaluations
Benchmark / mode | Score | Rank/total