Qwen3-235B-A22B-Thinking Benchmark Details

Qwen3-235B-A22B-Thinking's strongest benchmark placements are Creative Writing (score 86.10, rank 5 of 23), MMLU Pro (score 84.40, rank 34 of 126), and AIME2025 (score 92.30, rank 33 of 106).
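The three benchmarks named in the summary are also the ones with the smallest rank-to-total ratio among the results listed on this page. Whether the page actually selects them this way is an assumption; the sketch below only shows that the ordering is consistent with it. The `results` dictionary and the `relative_rank` helper are hypothetical names introduced for illustration, and the figures are copied from the tables on this page.

```python
# Scores and ranks copied from the benchmark tables on this page.
# Each entry maps benchmark name -> (score, rank, total models ranked).
results = {
    "MMLU Pro": (84.40, 34, 126),
    "GPQA Diamond": (81.10, 68, 179),
    "LiveBench (Thinking Enabled)": (52.97, 86, 115),
    "HLE": (18.20, 111, 159),
    "LiveCodeBench": (74.10, 41, 120),
    "AIME2025": (92.30, 33, 106),
    "IMO-ProofBench": (33.30, 6, 16),
    "IMO-ProofBench Advanced": (5.20, 5, 8),
    "Creative Writing": (86.10, 5, 23),
}


def relative_rank(rank: int, total: int) -> float:
    """Fraction of the leaderboard at or above this model (lower is better).

    Assumption: this ratio is one plausible way to compare placements
    across leaderboards of different sizes; it is not taken from the page.
    """
    return rank / total


# Sort benchmark names from best to worst relative placement.
ordered = sorted(results, key=lambda name: relative_rank(*results[name][1:]))
top_three = ordered[:3]

for name in ordered:
    score, rank, total = results[name]
    print(f"{name:30s} score={score:6.2f}  rank={rank}/{total}  "
          f"relative={relative_rank(rank, total):.3f}")
```

Raw scores are not comparable across benchmarks (92.30 on AIME2025 and 86.10 on Creative Writing are on different scales), which is why this sketch compares placements rather than scores.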

Benchmark Results

General Knowledge

4 evaluations

Benchmark / mode              Score   Rank / total
MMLU Pro                      84.40   34 / 126
GPQA Diamond                  81.10   68 / 179
LiveBench (Thinking Enabled)  52.97   86 / 115
HLE                           18.20   111 / 159

Coding and Software Engineering

1 evaluation

Benchmark / mode  Score   Rank / total
LiveCodeBench     74.10   41 / 120

Math and Reasoning

3 evaluations

Benchmark / mode         Score   Rank / total
AIME2025                 92.30   33 / 106
IMO-ProofBench           33.30   6 / 16
IMO-ProofBench Advanced  5.20    5 / 8

Writing and Creative Capabilities

1 evaluation

Benchmark / mode  Score   Rank / total
Creative Writing  86.10   5 / 23
