Qwen3.6-27B vs Gemini 3.0 Flash

Qwen3.6-27B and Gemini 3.0 Flash split the 6 shared benchmarks evenly: Qwen3.6-27B leads on 3 and Gemini 3.0 Flash leads on 3, with no benchmark tied. The average score difference (Qwen3.6-27B minus Gemini 3.0 Flash) is -1.88.

Alibaba
Qwen3.6-27B

Alibaba · 2026-04-22 · Reasoning model

Google DeepMind
Gemini 3.0 Flash

Google DeepMind · 2025-12-17 · Chat model

Qwen3.6-27B: 3 wins (50%) · Gemini 3.0 Flash: 3 wins (50%)

Benchmark scores

Grouped by capability, sorted by largest gap within each. 6 shared benchmarks.

Coding and Software Engineering

Qwen3.6-27B 2/2
| Benchmark | Qwen3.6-27B | Gemini 3.0 Flash | Diff |
| --- | --- | --- | --- |
| SWE-bench Verified | 77.20 · 25 / 108 · Thinking (With Tools) | 68.70 · 62 / 108 | +8.50 |
| SWE-Bench Pro - Public | 53.50 · 24 / 43 · Thinking (With Tools) | 49.60 · 32 / 43 · Thinking High (With Tools) | +3.90 |

General Knowledge

Gemini 3.0 Flash 2/2
| Benchmark | Qwen3.6-27B | Gemini 3.0 Flash | Diff |
| --- | --- | --- | --- |
| HLE | 24 · 92 / 157 · Thinking (No Tools) | 43.50 · 38 / 157 | -19.50 |
| GPQA Diamond | 87.80 · 33 / 178 · Thinking (No Tools) | 90.40 · 17 / 178 | -2.60 |

AI Agent - Tool Usage

Qwen3.6-27B 1/1
| Benchmark | Qwen3.6-27B | Gemini 3.0 Flash | Diff |
| --- | --- | --- | --- |
| Terminal Bench 2.0 | 59.30 · 20 / 46 · Thinking (With Tools) | 47.60 · 37 / 46 | +11.70 |

Claw-style Agent Evaluation

Gemini 3.0 Flash 1/1
| Benchmark | Qwen3.6-27B | Gemini 3.0 Flash | Diff |
| --- | --- | --- | --- |
| Claw Bench | 72.40 · 27 / 29 · Thinking (With Tools) | 85.70 · 15 / 29 · Thinking (With Tools) | -13.30 |
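The grouping and ordering used in this section (by capability, then by largest gap within each group) can be reproduced from the score pairs in the tables. The following is a minimal sketch, assuming Diff means Qwen3.6-27B minus Gemini 3.0 Flash and that "largest gap" means largest absolute difference. The `ROWS` list, the short group keys, and `group_by_capability` are illustrative names, not part of whatever generates this page.

```python
from collections import defaultdict

# (illustrative group key, benchmark, Qwen3.6-27B score, Gemini 3.0 Flash score)
# Scores are copied from the tables; the rows are deliberately unsorted.
ROWS = [
    ("coding", "SWE-Bench Pro - Public", 53.50, 49.60),
    ("coding", "SWE-bench Verified", 77.20, 68.70),
    ("knowledge", "GPQA Diamond", 87.80, 90.40),
    ("knowledge", "HLE", 24.00, 43.50),
    ("agent_tools", "Terminal Bench 2.0", 59.30, 47.60),
    ("claw", "Claw Bench", 72.40, 85.70),
]

def group_by_capability(rows):
    """Group rows by capability and sort each group by absolute gap, largest first."""
    groups = defaultdict(list)
    for group, bench, qwen, gemini in rows:
        # Assumption: Diff = Qwen3.6-27B score minus Gemini 3.0 Flash score.
        groups[group].append((bench, round(qwen - gemini, 2)))
    for entries in groups.values():
        entries.sort(key=lambda entry: abs(entry[1]), reverse=True)
    return dict(groups)

grouped = group_by_capability(ROWS)
for group, entries in grouped.items():
    qwen_leads = sum(1 for _, diff in entries if diff > 0)
    print(f"{group}: Qwen3.6-27B leads {qwen_leads}/{len(entries)}")
    for bench, diff in entries:
        print(f"  {bench}: {diff:+.2f}")
```

Sorting on the absolute value keeps negative gaps (where Gemini 3.0 Flash leads) ordered the same way as positive ones, which matches the row order shown in each group.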

Specs

| Field | Qwen3.6-27B | Gemini 3.0 Flash |
| --- | --- | --- |
| Publisher | Alibaba | Google DeepMind |
| Release date | 2026-04-22 | 2025-12-17 |
| Model type | Reasoning model | Chat model |
| Architecture | Dense | Dense |
| Parameters | 27B | Not available |
| Context length | 128K | 2000K |
| Max output | 16K | 64K |

Summary

  • Qwen3.6-27B leads in: Coding and Software Engineering (2/2), AI Agent - Tool Usage (1/1)
  • Gemini 3.0 Flash leads in: General Knowledge (2/2), Claw-style Agent Evaluation (1/1)

On average across the 6 shared benchmarks, Gemini 3.0 Flash scores 1.88 higher.

Largest single-benchmark gap: HLE, with Qwen3.6-27B at 24 vs Gemini 3.0 Flash at 43.50 (-19.50).
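The summary figures can be checked directly against the six Diff values listed in the benchmark tables. This is a minimal sketch, assuming each Diff is Qwen3.6-27B minus Gemini 3.0 Flash and that the average is an unweighted mean; the `DIFFS` dictionary and variable names are illustrative.

```python
from statistics import mean

# Diff values as listed per benchmark (assumed: Qwen3.6-27B minus Gemini 3.0 Flash).
DIFFS = {
    "SWE-bench Verified": 8.50,
    "SWE-Bench Pro - Public": 3.90,
    "HLE": -19.50,
    "GPQA Diamond": -2.60,
    "Terminal Bench 2.0": 11.70,
    "Claw Bench": -13.30,
}

values = list(DIFFS.values())
qwen_wins = sum(1 for d in values if d > 0)    # benchmarks where Qwen3.6-27B leads
gemini_wins = sum(1 for d in values if d < 0)  # benchmarks where Gemini 3.0 Flash leads
ties = sum(1 for d in values if d == 0)
avg_diff = round(mean(values), 2)              # unweighted mean of the six diffs
largest_gap = max(DIFFS, key=lambda name: abs(DIFFS[name]))

print(f"wins: Qwen3.6-27B {qwen_wins}, Gemini 3.0 Flash {gemini_wins}, ties {ties}")
print(f"average diff: {avg_diff:+.2f}")
print(f"largest gap: {largest_gap} ({DIFFS[largest_gap]:+.2f})")
```

The six diffs sum to -11.30, so the mean is -11.30 / 6 ≈ -1.88. That agrees with the statement that Gemini 3.0 Flash scores 1.88 higher on average, even though each model leads on three benchmarks.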

Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.