Qwen3.6-27B vs GPT-5.4 mini
Across 5 shared benchmarks, GPT-5.4 mini leads overall: Qwen3.6-27B wins 0, GPT-5.4 mini wins 5, with 0 ties and an average score difference of -4.44 (Qwen3.6-27B minus GPT-5.4 mini).
Qwen3.6-27B
Alibaba · 2026-04-22 · Reasoning model
GPT-5.4 mini
OpenAI · 2026-03-17 · Reasoning model
Qwen3.6-27B: 0 wins (0%) · GPT-5.4 mini: 5 wins (100%)
Benchmark scores
Grouped by capability, sorted by largest gap within each. 5 shared benchmarks.
General Knowledge
GPT-5.4 mini 2/2

| Benchmark | Qwen3.6-27B | GPT-5.4 mini | Diff |
|---|---|---|---|
| HLE | 24 (rank 84 / 149; Thinking, No Tools) | 41.50 (rank 41 / 149; extra-high-effort thinking, with tools) | -17.50 |
| GPQA Diamond | 87.80 (rank 30 / 175; Thinking, No Tools) | 88 (rank 29 / 175; extra-high-effort thinking, no tools) | -0.20 |
AI Agent - Tool Usage
GPT-5.4 mini 1/1

| Benchmark |
|---|
Specs
| Field | Qwen3.6-27B | GPT-5.4 mini |
|---|---|---|
| Publisher | Alibaba | OpenAI |
| Release date | 2026-04-22 | 2026-03-17 |
| Model type | Reasoning model | Reasoning model |
| Architecture | Dense | Dense |
| Parameters | 27B (recorded as 270.0) | Not public (recorded as 0.0) |
| Context length | 128K | 400K |
| Max output | 16384 | 131072 |
API pricing
Prices use DataLearner records when available; missing fields are not inferred.
| Item | Qwen3.6-27B | GPT-5.4 mini |
|---|---|---|
| Text input | Not public | $0.75 / 1M tokens |
| Text output | Not public | $4.5 / 1M tokens |
| Cache read | Not public | $4.5 / 1M tokens |
| Cache write | Not public | $0.075 / 1M tokens |
One or both models have incomplete public pricing.
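As a rough illustration of how the per-million-token prices above translate into a per-request cost, here is a minimal sketch for GPT-5.4 mini. It uses only the text input and text output rates from the table. The function name `estimate_cost_usd` and the token counts are hypothetical, and cache pricing is deliberately ignored.

```python
# Rates copied from the pricing table above (USD per 1M tokens).
INPUT_USD_PER_M = 0.75
OUTPUT_USD_PER_M = 4.5


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, ignoring cache reads and writes."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


# Hypothetical request: 10,000 prompt tokens and 2,000 completion tokens.
cost = estimate_cost_usd(10_000, 2_000)
print(f"Estimated cost: ${cost:.4f}")
```

No equivalent estimate is possible for Qwen3.6-27B from this page, since its prices are listed as not public.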
Summary
- GPT-5.4 mini leads in: General Knowledge (2/2), AI Agent - Tool Usage (1/1), Claw-style Agent Evaluation (1/1), Coding and Software Engineer (1/1)
On average across the 5 shared benchmarks, GPT-5.4 mini scores 4.44 higher.
Largest single-benchmark gap: HLE — Qwen3.6-27B 24 vs GPT-5.4 mini 41.50 (-17.50).
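The win counts, average difference, and largest gap reported above can be derived from per-benchmark scores roughly as follows. This is a sketch, not the site's actual code: the `Row` class and `summarize` function are hypothetical names, and the data covers only the two benchmarks whose scores appear on this page (HLE and GPQA Diamond), so its average will not match the five-benchmark figure of -4.44.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    capability: str
    benchmark: str
    qwen: float      # Qwen3.6-27B score
    gpt_mini: float  # GPT-5.4 mini score

    @property
    def diff(self) -> float:
        # Same sign convention as the page: Qwen3.6-27B minus GPT-5.4 mini.
        return round(self.qwen - self.gpt_mini, 2)


# Only the two rows visible in the General Knowledge table.
ROWS = [
    Row("General Knowledge", "GPQA Diamond", 87.80, 88.00),
    Row("General Knowledge", "HLE", 24.00, 41.50),
]


def summarize(rows):
    """Return rows sorted by largest gap, win/tie counts, and the mean diff."""
    ordered = sorted(rows, key=lambda r: abs(r.diff), reverse=True)
    qwen_wins = sum(r.diff > 0 for r in rows)
    gpt_wins = sum(r.diff < 0 for r in rows)
    ties = sum(r.diff == 0 for r in rows)
    avg = round(sum(r.diff for r in rows) / len(rows), 2)
    return ordered, qwen_wins, gpt_wins, ties, avg


ordered, qwen_wins, gpt_wins, ties, avg_diff = summarize(ROWS)
for r in ordered:
    print(f"{r.benchmark}: {r.qwen} vs {r.gpt_mini} ({r.diff:+.2f})")
print(f"Qwen wins {qwen_wins}, GPT-5.4 mini wins {gpt_wins}, ties {ties}, avg {avg_diff:+.2f}")
```

On these two rows alone, HLE sorts first as the largest gap and GPT-5.4 mini wins both. The mean difference is larger in magnitude than the page-wide figure because the three remaining benchmarks are not included.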
Page generated from structured model, pricing and benchmark records. No real-time LLM is used to write the prose.