Claude Sonnet 4 Benchmark Details
Claude Sonnet 4's leading benchmark results are SWE-bench Verified (rank 9 of 103, score 80.20), LiveBench (rank 11 of 52, score 73.82), and MMLU Pro (rank 35 of 124, score 84). One source link is attached for reference.
Benchmark Results
Comprehensive Evaluation
12 evaluations
Benchmark / mode | Score | Rank/total
Coding and Software Engineering
5 evaluations
Benchmark / mode | Score | Rank/total
Mathematical Reasoning
12 evaluations
Benchmark / mode | Score | Rank/total
AI Agent - Tool Use
4 evaluations
Benchmark / mode | Score | Rank/total
Agent Capability Evaluation
3 evaluations
Benchmark / mode | Score | Rank/total
OpenClaw Comprehensive Agent Capability Evaluation
2 evaluations
Benchmark / mode | Score | Rank/total