Qwen3.6-27B vs Qwen3.5-27B
Across 8 shared benchmarks, Qwen3.6-27B leads overall: Qwen3.6-27B wins 6, Qwen3.5-27B wins 2, with 0 ties and an average score difference of +0.21.
Qwen3.6-27B
Alibaba · 2026-04-22 · Reasoning model
Qwen3.5-27B
Alibaba · 2026-02-25 · Reasoning model
Qwen3.6-27B: 6 wins (75%) · Qwen3.5-27B: 2 wins (25%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 8 shared benchmarks.
General Knowledge
Qwen3.6-27B leads 3/4

| Benchmark | Qwen3.6-27B | Qwen3.5-27B | Diff |
|---|---|---|---|
| HLE | 24 (rank 84 / 149), Thinking (No Tools) | 48.50 (rank 21 / 149), Thinking (With Tools) | -24.50 |
| GPQA Diamond | 87.80 (rank 30 / 175), Thinking (No Tools) | 85.50 (rank 44 / 175), Thinking (No Tools) | +2.30 |
| C-Eval | 91.40 (rank 5 / 9), Thinking (No Tools) | 90.50 (rank 6 / 9) | +0.90 |
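The Diff column and the "sorted by largest gap" ordering can be reproduced with a few lines of Python. This is only a sketch, not the page's actual generator: the names `rows`, `with_diff` and `sort_by_gap` are hypothetical, and it assumes Diff means the Qwen3.6-27B score minus the Qwen3.5-27B score, which matches the HLE and GPQA Diamond rows.

```python
# Scores copied from the three General Knowledge rows above:
# (benchmark, Qwen3.6-27B score, Qwen3.5-27B score)
rows = [
    ("GPQA Diamond", 87.80, 85.50),
    ("C-Eval", 91.40, 90.50),
    ("HLE", 24.00, 48.50),
]


def with_diff(rows):
    """Attach the signed difference (first model minus second) to each row."""
    return [(name, a, b, round(a - b, 2)) for name, a, b in rows]


def sort_by_gap(rows):
    """Order rows by absolute difference, largest gap first."""
    return sorted(with_diff(rows), key=lambda r: abs(r[3]), reverse=True)


for name, a, b, diff in sort_by_gap(rows):
    print(f"{name}: {a:.2f} vs {b:.2f} ({diff:+.2f})")
```

Sorting on the absolute value puts HLE first even though its difference is negative, which is the order the table uses.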
Specs
| Field | Qwen3.6-27B | Qwen3.5-27B |
|---|---|---|
| Publisher | Alibaba | Alibaba |
| Release date | 2026-04-22 | 2026-02-25 |
| Model type | Reasoning model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | 27B | 27B |
| Context length | 128K | 1010K |
| Max output | 16384 | 248320 |
Summary
- Qwen3.6-27B leads in: General Knowledge (3/4), Coding and Software Engineering (2/2), AI Agent - Tool Usage (1/1)
- Qwen3.5-27B leads in: Claw-style Agent Evaluation (1/1)
On average across the 8 shared benchmarks, Qwen3.6-27B scores 0.21 points higher.
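Largest single-benchmark gap: HLE, where Qwen3.6-27B scores 24 and Qwen3.5-27B scores 48.50 (-24.50). The two HLE results were recorded in different modes (No Tools for Qwen3.6-27B, With Tools for Qwen3.5-27B).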
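The summary figures (win counts, win share, average difference, largest gap) come from simple arithmetic over per-benchmark score pairs. The sketch below shows one way to compute them. The function names `summarize` and `win_share` are hypothetical, and only the three benchmarks whose scores appear on this page are included, so the average it computes will not match the +0.21 reported across all 8.

```python
def summarize(scores):
    """Summarize head-to-head results.

    `scores` maps benchmark name -> (score_a, score_b).
    Differences are taken as score_a minus score_b.
    """
    diffs = {name: a - b for name, (a, b) in scores.items()}
    wins_a = sum(1 for d in diffs.values() if d > 0)
    wins_b = sum(1 for d in diffs.values() if d < 0)
    return {
        "wins_a": wins_a,
        "wins_b": wins_b,
        "ties": len(diffs) - wins_a - wins_b,
        "mean_diff": sum(diffs.values()) / len(diffs),
        "largest_gap": max(diffs, key=lambda name: abs(diffs[name])),
    }


def win_share(wins, total):
    """Percentage of benchmarks won, rounded to a whole number."""
    return round(100 * wins / total)


# Only the three benchmarks listed on this page (Qwen3.6-27B, Qwen3.5-27B).
visible = {
    "HLE": (24.00, 48.50),
    "GPQA Diamond": (87.80, 85.50),
    "C-Eval": (91.40, 90.50),
}

print(summarize(visible))
print(win_share(6, 8), win_share(2, 8))  # shares for the 6-2 split reported above
```

Because the mean is taken over signed differences, a single large loss such as HLE can offset several narrow wins, which is why a 6-2 record yields an average lead of only 0.21.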
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.