Qwen3.5-27B vs Qwen3-32B

Across 4 shared benchmarks, Qwen3.5-27B leads overall: Qwen3.5-27B wins 4, Qwen3-32B wins 0, with 0 ties and an average score difference of +158.37.

Qwen3.5-27B: Alibaba · 2026-02-25 · Reasoning model

Qwen3-32B: Alibaba · 2025-04-28 · Reasoning model

Qwen3.5-27B: 4 wins (100%) · Qwen3-32B: 0 wins (0%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 4 shared benchmarks.

Qwen3.5-27B leads 2/2 in this group

Benchmark	Qwen3.5-27B	Qwen3-32B	Diff
CodeForces	1,899 · rank 15 / 16 · Thinking (No Tools)	1,353 · rank 16 / 16 · Normal (No Tools)	+546
LiveCodeBench	80.70 · rank 27 / 123 · Thinking (With Tools)	31.30 · rank 117 / 123 · Normal (No Tools)	+49.40

Qwen3.5-27B leads 2/2 in this group

Benchmark	Qwen3.5-27B	Qwen3-32B	Diff
GPQA Diamond	85.50 · rank 52 / 187 · Thinking (No Tools)	54.60 · rank 155 / 187 · Normal (No Tools)	+30.90
C-Eval	90.50 · rank 6 / 10 · Thinking (No Tools)	83.30 · rank 10 / 10 · Normal (No Tools)	+7.20

Prices use DataLearner records when available; missing fields are not inferred.

Item	Qwen3.5-27B	Qwen3-32B
Text input	Not public	¥0.0012 / 1K tokens
Text output	Not public	¥0.0048 / 1K tokens

Qwen3.5-27B has no public pricing on record, so the two models cannot be compared on cost.
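As a rough illustration of how the listed Qwen3-32B prices translate into per-request cost, the sketch below converts the per-1K-token prices from the table above into per-million-token prices and estimates the cost of one request. The function names and the example token counts (2,000 input, 500 output) are hypothetical and chosen only for illustration; the prices are the ones shown on this page.

```python
from decimal import Decimal

# Qwen3-32B prices as listed on this page, in CNY per 1K tokens.
INPUT_PER_1K = Decimal("0.0012")
OUTPUT_PER_1K = Decimal("0.0048")


def per_million(price_per_1k: Decimal) -> Decimal:
    """Convert a per-1K-token price into a per-1M-token price."""
    return price_per_1k * 1000


def request_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in CNY of one request from its token counts."""
    input_cost = Decimal(input_tokens) / 1000 * INPUT_PER_1K
    output_cost = Decimal(output_tokens) / 1000 * OUTPUT_PER_1K
    return input_cost + output_cost


# Hypothetical request: 2,000 input tokens and 500 output tokens.
example_cost = request_cost(2000, 500)
print(f"Input:  ¥{per_million(INPUT_PER_1K)} per 1M tokens")
print(f"Output: ¥{per_million(OUTPUT_PER_1K)} per 1M tokens")
print(f"Example request: ¥{example_cost}")
```

`Decimal` is used instead of `float` so that small per-token prices do not pick up binary rounding noise. No equivalent estimate is possible for Qwen3.5-27B because its prices are not public.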

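Averaged across the 4 shared benchmarks, Qwen3.5-27B scores 158.37 points higher. This figure is dominated by the CodeForces rating gap (+546), since the other three benchmarks are scored on a 0-100 scale.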

Largest single-benchmark gap: CodeForces — Qwen3.5-27B 1,899 vs Qwen3-32B 1,353 (+546).
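The summary figures above can be reproduced from the four benchmark rows. The following is a minimal sketch, not the page's actual generation code: the `summarize` function and its field names are assumptions for illustration, and the scores are copied from the tables on this page.

```python
from statistics import mean

# (benchmark, Qwen3.5-27B score, Qwen3-32B score), copied from this page.
SCORES = [
    ("CodeForces", 1899.0, 1353.0),
    ("LiveCodeBench", 80.70, 31.30),
    ("GPQA Diamond", 85.50, 54.60),
    ("C-Eval", 90.50, 83.30),
]


def summarize(scores):
    """Compute per-benchmark diffs, win counts, mean diff, and largest gap."""
    # Round to 2 decimals to match the precision shown in the tables.
    diffs = {name: round(a - b, 2) for name, a, b in scores}
    values = list(diffs.values())
    return {
        "diffs": diffs,
        "wins_a": sum(1 for d in values if d > 0),
        "wins_b": sum(1 for d in values if d < 0),
        "ties": sum(1 for d in values if d == 0),
        "mean_diff": mean(values),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


summary = summarize(SCORES)
print(summary["wins_a"], summary["wins_b"], summary["ties"])
print(summary["largest_gap"], summary["diffs"][summary["largest_gap"]])
print(summary["mean_diff"])
```

The exact mean of the four differences is 158.375, which the page displays as 158.37. Because the mean mixes a CodeForces rating with three percentage scores, the per-benchmark differences are the more informative numbers.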

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.