Claude Sonnet 4

Name: Claude Sonnet 4
Author: Anthropic

Reasoning model, Sonnet, Claude 4

Release date: 2025-05-23
Updated: 2025-10-19 12:24:14

Parameters: Not disclosed
Context length: 200K
Chinese support: Supported

Reasoning ability

Claude Sonnet 4 is a reasoning model published by Anthropic and released on 2025-05-23. It has a 200K-token context length, is not open source, and scores 1223.00 on CodeClash.

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then benchmark leaderboards, then third-party evaluators.

Model basics

Reasoning traces: Supported
Thinking modes: Not supported
Context length: 200K tokens
Max output length: 64K tokens
Model type: Reasoning model
Modality (in / out): Text, Image → Text
Release date: 2025-05-23
Model file size: No data

MoE architecture

Total params / Active params: No data / N/A
Knowledge cutoff: No data
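The context and output limits listed above can be turned into a simple budget check. The following Python sketch is illustrative only: it assumes "200K" and "64K" mean 200,000 and 64,000 tokens, and that input and requested output share the context window, neither of which this page confirms. `fits_budget` is a hypothetical helper, not an Anthropic API.

```python
CONTEXT_LIMIT = 200_000  # assumed reading of "200K tokens"
MAX_OUTPUT = 64_000      # assumed reading of "64K tokens"


def fits_budget(input_tokens: int, requested_output: int) -> bool:
    """Check a request against the listed limits (assumes a shared window)."""
    if input_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if requested_output > MAX_OUTPUT:
        return False
    return input_tokens + requested_output <= CONTEXT_LIMIT


print(fits_budget(150_000, 32_000))  # within both assumed limits
print(fits_budget(150_000, 64_000))  # exceeds the assumed shared window
```

Token counts would have to come from whatever tokenizer or counting endpoint the deployment actually uses; the sketch only does the arithmetic.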

Open source & experience

Code license: Not open source
Weights license: Not open source
GitHub repo: Link unavailable
Hugging Face: Link unavailable
Live demo: No live demo

Official resources

Paper: Introducing Claude 4

DataLearnerAI blog

API details

API speed: 4/5

No public API pricing yet.

Benchmark Results

Claude Sonnet 4's strongest listed benchmark results are SWE-bench Verified (rank 13 / 108, score 80.20), Terminal-Bench (rank 10 / 35, score 41.30), and MMLU Pro (rank 37 / 126, score 84). This page also consolidates core specs, context limits, and API pricing so you can evaluate the model on benchmark results and deployment constraints together.
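Because the leaderboards differ in size, raw ranks are not directly comparable. A minimal Python sketch, assuming each "rank / total" pair means position `rank` among `total` ranked entries, converts the three headline results into a relative position. The function name `relative_rank` is illustrative only and not part of any DataLearner tooling.

```python
def relative_rank(rank: int, total: int) -> float:
    """Return rank as a fraction of the leaderboard size (lower is better)."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return rank / total


# Headline results quoted in the benchmark summary: (rank, total).
headline = {
    "SWE-bench Verified": (13, 108),
    "Terminal-Bench": (10, 35),
    "MMLU Pro": (37, 126),
}

# Sort from strongest to weakest relative placement.
ordered = sorted(headline, key=lambda name: relative_rank(*headline[name]))

for name in ordered:
    rank, total = headline[name]
    print(f"{name}: top {relative_rank(rank, total):.0%} ({rank} / {total})")
```

This normalizes only for leaderboard size; it says nothing about how strong the other entries on each leaderboard are.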

General Knowledge

12 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| MMLU Pro | | 37 / 126 |
| GPQA Diamond | 83.80 | 58 / 179 |
| GPQA Diamond | 75.40 | 92 / 179 |
| GPQA Diamond | | 123 / 179 |
| LiveBench (Standard Mode) | 50.98 | 89 / 115 |
| LiveBench (64K) | 61.27 | 65 / 115 |
| ARC-AGI | | 46 / 65 |
| ARC-AGI | 23.80 | 53 / 65 |
| HLE | 9.60 | 136 / 159 |
| HLE | 5.52 | 150 / 159 |
| ARC-AGI-2 | 5.90 | 43 / 59 |
| ARC-AGI-2 | 1.30 | 52 / 59 |

Coding and Software Engineering

6 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| CodeClash (Standard Mode, Tools) | 1223 | 4 / 8 |
| SWE-bench Verified | 80.20 | 13 / 108 |
| SWE-bench Verified | 72.70 | 47 / 108 |
| LiveCodeBench | | 58 / 120 |
| LiveCodeBench | 48.50 | 94 / 120 |
| SWE-Bench Pro - Public | 42.70 | 38 / 44 |

Math and Reasoning

12 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| AIME 2025 | | 50 / 106 |
| AIME 2025 | 70.50 | 71 / 106 |
| AIME 2025 | | 95 / 106 |
| AIME 2024 | 43.40 | 50 / 62 |
| IMO-ProofBench | 27.10 | 8 / 16 |
| IMO 2024 | 9.70 | 5 / 10 |
| IMO 2024 | 5.20 | 8 / 10 |
| IMO-ProofBench Advanced | 4.80 | 6 / 8 |
| FrontierMath | 4.10 | 41 / 60 |
| IMO 2025 | | 5 / 9 |
| IMO 2025 | 3.30 | 6 / 9 |
| FrontierMath - Tier 4 (Standard Mode) | | 72 / 80 |

Writing and Creative Capabilities

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| Creative Writing | 83.05 | 14 / 23 |

AI Agent - Tool Usage

4 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| OSWorld-Verified | 42.20 | 16 / 18 |
| Terminal-Bench | 41.30 | 10 / 35 |
| Terminal-Bench | 35.50 | 18 / 35 |
| Terminal-Bench | | 26 / 35 |

Multimodal Understanding

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| MMMU | 76.50 | 16 / 28 |

Commonsense Reasoning

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| Simple Bench (Thinking Mode) | 45.50 | 34 / 63 |

Agent Level Benchmark

4 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| τ²-Bench - Telecom | | 29 / 35 |
| Aider-Polyglot (Standard Mode) | 56.40 | 26 / 59 |
| Aider-Polyglot (32K) | 61.30 | 20 / 59 |
| τ²-Bench | | 33 / 40 |

Instruction Following

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| IF Bench | | 22 / 29 |

Productivity Knowledge

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| GDPval-AA | | 19 / 21 |

Long Context

1 evaluation

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| AA-LCR | | 10 / 13 |

Claw-style Agent Evaluation

2 evaluations

| Benchmark / mode | Score | Rank/total |
|---|---|---|
| Pinch Bench (Thinking Mode, Tools) | 80.50 | 22 / 37 |
| Claw Bench (Thinking Mode, Tools) | 77.80 | 23 / 29 |

Publisher: Anthropic
