Qwen 3.6 Plus Preview vs Qwen3.5-397B-A17B

Across 14 shared benchmarks, Qwen 3.6 Plus Preview leads overall: Qwen 3.6 Plus Preview wins 12, Qwen3.5-397B-A17B wins 2, with 0 ties and an average score difference of +2.59.

Qwen 3.6 Plus Preview

Alibaba · 2026-03-31 · Chat model

Qwen3.5-397B-A17B

Alibaba · 2026-02-16 · Multimodal model

Qwen 3.6 Plus Preview: 12 wins (86%) · Qwen3.5-397B-A17B: 2 wins (14%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 14 shared benchmarks.

Coding and Software Engineer

Qwen 3.6 Plus Preview 4/4
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
SWE-Bench Pro - Public | 56.60 · 13 / 43 · Thinking (With Tools) | 50.90 · 29 / 43 · Thinking (No Tools) | +5.70
SWE-bench Multilingual | 73.80 · 7 / 20 · Thinking (No Tools) | 69.30 · 17 / 20 · Thinking (No Tools) | +4.50
LiveCodeBench | 87.10 · 10 / 120 · Thinking (No Tools) | 83.60 · 20 / 120 · Thinking (No Tools) | +3.50
SWE-bench Verified | 78.80 · 20 / 108 · Thinking (With Tools) | 76.40 · 29 / 108 · Thinking (With Tools) | +2.40

General Knowledge

Qwen 3.6 Plus Preview 4/4
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
HLE | 50.60 · 17 / 157 · Thinking (With Tools) | 48.30 · 28 / 157 · Thinking (With Tools + Internet) | +2.30
GPQA Diamond | 90.40 · 17 / 178 · Thinking (No Tools) | 88.40 · 26 / 178 · Thinking (No Tools) | +2
MMLU Pro | 88.50 · 5 / 126 · Thinking (No Tools) | 87.80 · 10 / 126 · Thinking (No Tools) | +0.70
C-Eval | 93.30 · 2 / 9 · Thinking (No Tools) | 93 · 3 / 9 · Thinking (No Tools) | +0.30

AI Agent - Tool Usage

Qwen 3.6 Plus Preview 2/2
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
Terminal Bench 2.0 | 61.60 · 16 / 46 · Thinking (With Tools) | 52.50 · 29 / 46 · Thinking (With Tools) | +9.10
Tool Decathlon | 39.80 · 4 / 7 · Thinking (With Tools) | 38.30 · 5 / 7 · Thinking (With Tools) | +1.50

Math and Reasoning

Qwen 3.6 Plus Preview 2/2
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
AIME 2026 | 95.30 · 2 / 14 · Thinking (No Tools) | 91.30 · 11 / 14 · Thinking (No Tools) | +4
IMO-AnswerBench | 83.80 · 10 / 19 · Thinking (No Tools) | 80.90 · 15 / 19 · Thinking (No Tools) | +2.90

Instruction Following

Qwen3.5-397B-A17B 1/1
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
IF Bench | 74.20 · 6 / 29 · Thinking (No Tools) | 76.50 · 3 / 29 · Thinking (No Tools) | -2.30

Long Context

Qwen3.5-397B-A17B 1/1
Benchmark | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B | Diff
AA-LCR | 68.30 · 6 / 13 · Thinking (No Tools) | 68.70 · 5 / 13 · Thinking (No Tools) | -0.40

Specs

Field | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B
Publisher | Alibaba | Alibaba
Release date | 2026-03-31 | 2026-02-16
Model type | Chat model | Multimodal model
Architecture | Dense | MoE
Parameters | Not available | 397B
Context length | 1M | 256K
Max output | 64K | Not available

API pricing

Prices use DataLearner records when available; missing fields are not inferred.

Item | Qwen 3.6 Plus Preview | Qwen3.5-397B-A17B
Text input | $0.5 / 1M tokens | $0.5 / 1M tokens
Text output | $3 / 1M tokens | $3 / 1M tokens
Cache read | $0.05 / 1M tokens | $0.05 / 1M tokens
Cache write | $0.625 / 1M tokens | $0.625 / 1M tokens
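
As a rough illustration of how the per-million-token prices translate into a request cost, the sketch below multiplies token counts by the listed rates. It assumes billing is a simple linear sum over the four listed items and that each token is billed under exactly one item; the actual provider billing rules are not stated on this page. The function name `estimate_cost` and the example token counts are hypothetical.

```python
# Listed prices in USD per 1M tokens (identical for both models on this page).
PRICES_PER_MILLION = {
    "input": 0.5,
    "output": 3.0,
    "cache_read": 0.05,
    "cache_write": 0.625,
}


def estimate_cost(tokens: dict[str, int]) -> float:
    """Estimate cost in USD, assuming a linear per-token charge for each item."""
    return sum(
        PRICES_PER_MILLION[item] * count / 1_000_000
        for item, count in tokens.items()
    )


# Hypothetical workload: 200K input tokens and 20K output tokens, no caching.
example = {"input": 200_000, "output": 20_000}
cost = estimate_cost(example)
print(f"${cost:.4f}")
```

Because the listed prices are the same for both models, this estimate does not distinguish between them; any real cost difference would come from how many tokens each model consumes for the same task.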

Summary

  • Qwen 3.6 Plus Preview leads in: Coding and Software Engineer (4/4), General Knowledge (4/4), AI Agent - Tool Usage (2/2), Math and Reasoning (2/2)
  • Qwen3.5-397B-A17B leads in: Instruction Following (1/1), Long Context (1/1)

On average across the 14 shared benchmarks, Qwen 3.6 Plus Preview scores 2.59 higher.

Largest single-benchmark gap: Terminal Bench 2.0 — Qwen 3.6 Plus Preview 61.60 vs Qwen3.5-397B-A17B 52.50 (+9.10).
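
The summary figures can be re-derived from the per-benchmark differences. The sketch below assumes the average is an unweighted mean of the 14 Diff values (Qwen 3.6 Plus Preview minus Qwen3.5-397B-A17B) and that a positive difference counts as a win; the page does not state its method, but these assumptions reproduce the reported numbers.

```python
# Diff values copied from the benchmark tables
# (Qwen 3.6 Plus Preview score minus Qwen3.5-397B-A17B score).
diffs = {
    "SWE-Bench Pro - Public": 5.70,
    "SWE-bench Multilingual": 4.50,
    "LiveCodeBench": 3.50,
    "SWE-bench Verified": 2.40,
    "HLE": 2.30,
    "GPQA Diamond": 2.00,
    "MMLU Pro": 0.70,
    "C-Eval": 0.30,
    "Terminal Bench 2.0": 9.10,
    "Tool Decathlon": 1.50,
    "AIME 2026": 4.00,
    "IMO-AnswerBench": 2.90,
    "IF Bench": -2.30,
    "AA-LCR": -0.40,
}

# Assumption: a positive diff is a win for Qwen 3.6 Plus Preview.
wins = sum(1 for d in diffs.values() if d > 0)
losses = sum(1 for d in diffs.values() if d < 0)
ties = len(diffs) - wins - losses

# Assumption: the reported average is an unweighted mean over all benchmarks.
average = sum(diffs.values()) / len(diffs)

# Benchmark with the largest absolute gap.
largest = max(diffs, key=lambda name: abs(diffs[name]))

print(f"wins={wins} losses={losses} ties={ties}")
print(f"win share: {round(100 * wins / len(diffs))}%")
print(f"average diff: {average:+.2f}")
print(f"largest gap: {largest} ({diffs[largest]:+.2f})")
```

An unweighted mean treats every benchmark equally regardless of its score scale or evaluation mode, so it is best read as a rough indicator rather than a calibrated measure.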

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.