Benchmark Results
General evaluation: 4 evaluations
Coding and software engineering: 4 evaluations
Competitor Comparison
Benchmark scores for Qwen3.7-Max-Preview compared against top models in its class
9 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | Qwen3.7-Max-Preview (current) | Kimi K2.6 | DeepSeek-V4-Pro | GLM 5.1 |
|---|---|---|---|---|
| GPQA Diamond (General evaluation) | 92.40 (Thinking Level: High) | 90.50 (Thinking Enabled) | 90.10 (Thinking Level: High) | 86.20 (Thinking Enabled) |
| HLE (General evaluation) | 53.50 (Thinking Enabled, Tools) | 54.00 (Thinking Enabled, Tools) | 48.20 (Thinking Level: Extra High, Tools) | 52.30 (Thinking Enabled, Tools) |
| MMLU Pro (General evaluation) | 89.60 (Thinking Level: High) | -- | 87.50 (Thinking Level: High) | -- |
| LiveCodeBench (Coding and software engineering) | 91.60 (Thinking Level: High) | 89.60 (Thinking Enabled) | 93.50 (Thinking Level: High) | -- |
| SWE-bench Multilingual (Coding and software engineering) | 78.30 (Thinking Enabled, Tools) | 76.70 (Thinking Enabled, Tools) | 76.20 (Thinking Level: Extra High, Tools) | -- |
| SWE-Bench Pro - Public (Coding and software engineering) | 60.60 (Thinking Enabled, Tools) | 58.60 (Thinking Enabled, Tools) | 55.40 (Thinking Level: Extra High, Tools) | 58.40 (Thinking Enabled, Tools) |
| SWE-bench Verified (Coding and software engineering) | 80.40 (Thinking Enabled, Tools) | 80.20 (Thinking Enabled, Tools) | 80.60 (Thinking Level: Extra High, Tools) | -- |
| Terminal Bench 2.0 (AI agent: tool use) | 69.70 (Thinking Enabled, Tools) | 66.70 (Thinking Enabled, Tools) | 67.90 (Thinking Level: Extra High, Tools) | 63.50 (Thinking Enabled, Tools) |
| IMO-AnswerBench (Math reasoning) | 90.00 (Thinking Level: High) | 86.00 (Thinking Enabled) | 89.80 (Thinking Level: High) | 83.80 (Thinking Enabled) |
Standard API Pricing: Qwen3.7-Max-Preview vs. Peer Models
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Qwen3.7-Max-Preview | Alibaba | $2.5 / 1M tokens | $7.5 / 1M tokens | -- |
| Kimi K2.6 | Moonshot AI | $0.95 / 1M tokens | $4 / 1M tokens | -- |
| DeepSeek-V4-Pro | DeepSeek-AI | $1.74 / 1M tokens | $3.48 / 1M tokens | -- |
| GLM 5.1 | Zhipu AI | $1.4 / 1M tokens | $4.4 / 1M tokens | -- |
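
To make the per-token rates easier to compare, the following is a minimal sketch that converts the standard prices listed above into an estimated cost for one workload. The function name `estimate_cost` and the example workload (2M input tokens, 500K output tokens) are illustrative assumptions, not part of the source page, and the sketch ignores any extended-context pricing tiers.

```python
# Standard prices in USD per 1M tokens, copied from the pricing table above:
# (input price, output price).
PRICES = {
    "Qwen3.7-Max-Preview": (2.5, 7.5),
    "Kimi K2.6": (0.95, 4.0),
    "DeepSeek-V4-Pro": (1.74, 3.48),
    "GLM 5.1": (1.4, 4.4),
}


def estimate_cost(model, input_tokens, output_tokens):
    """Return the estimated USD cost of a workload at standard rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload (an assumption for illustration only).
for name in PRICES:
    print(f"{name}: ${estimate_cost(name, 2_000_000, 500_000):.2f}")
```

Because output tokens are priced higher than input tokens for every model listed, the ranking can shift for output-heavy workloads, so it is worth rerunning the estimate with token counts that match the intended use.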
Version History
How each version of the Qwen3.7-Max-Preview series stacks up on benchmark tests
10 benchmarks with comparable scores. Each model shows its best score; mode label is displayed below.
| Benchmark | Qwen3.7-Max-Preview (current) | Qwen3.6-Max-Preview | Qwen3-Max-Thinking |
|---|---|---|---|
| GPQA Diamond (General evaluation) | 92.40 (Thinking Level: High) | 90.40 (Thinking Level: High) | 87.40 (Thinking Enabled) |
| HLE (General evaluation) | 53.50 (Thinking Enabled, Tools) | 50.20 (Thinking Enabled, Tools) | 49.80 (Thinking Enabled, Tools) |
| MMLU Pro (General evaluation) | 89.60 (Thinking Level: High) | 88.50 (Thinking Level: High) | 85.70 (Thinking Enabled) |
| LiveCodeBench (Coding and software engineering) | 91.60 (Thinking Level: High) | 87.10 (Thinking Level: High) | 85.90 (Thinking Enabled) |
| SWE-bench Multilingual (Coding and software engineering) | 78.30 (Thinking Enabled, Tools) | 73.80 (Thinking Enabled, Tools) | -- |
| SWE-Bench Pro - Public (Coding and software engineering) | 60.60 (Thinking Enabled, Tools) | 56.60 (Thinking Enabled, Tools) | -- |
| SWE-bench Verified (Coding and software engineering) | 80.40 (Thinking Enabled, Tools) | 78.80 (Thinking Enabled, Tools) | 75.30 (Thinking Enabled) |
| IF Bench (Instruction following) | 79.10 (Thinking Level: High) | 74.20 (Thinking Level: High) | 70.90 (Thinking Enabled, Tools) |
| Terminal Bench 2.0 (AI agent: tool use) | 69.70 (Thinking Enabled, Tools) | 65.40 (Deep Thinking Mode, Tools) | -- |
| IMO-AnswerBench (Math reasoning) | 90.00 (Thinking Level: High) | 83.80 (Thinking Level: High) | 83.90 (Thinking Enabled) |
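
As a rough way to read the version table, the sketch below computes the point gain of Qwen3.7-Max-Preview over Qwen3.6-Max-Preview on each benchmark, using the best scores listed above. The helper name `gains` is an illustrative assumption. The comparison is also not strictly like-for-like, since the reported modes differ on some rows (for example, Terminal Bench 2.0 lists different thinking modes for the two versions).

```python
# Best scores from the version table above:
# (Qwen3.7-Max-Preview, Qwen3.6-Max-Preview).
SCORES = {
    "GPQA Diamond": (92.40, 90.40),
    "HLE": (53.50, 50.20),
    "MMLU Pro": (89.60, 88.50),
    "LiveCodeBench": (91.60, 87.10),
    "SWE-bench Multilingual": (78.30, 73.80),
    "SWE-Bench Pro - Public": (60.60, 56.60),
    "SWE-bench Verified": (80.40, 78.80),
    "IF Bench": (79.10, 74.20),
    "Terminal Bench 2.0": (69.70, 65.40),
    "IMO-AnswerBench": (90.00, 83.80),
}


def gains(scores):
    """Return the point gain (newer minus older) for each benchmark."""
    return {name: round(new - old, 2) for name, (new, old) in scores.items()}


deltas = gains(SCORES)
# Print benchmarks from largest to smallest gain.
for name, delta in sorted(deltas.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: +{delta}")
```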
Single-Benchmark Version Trend
Viewing: GPQA Diamond (General evaluation)
Standard API Pricing Across the Qwen3.7-Max-Preview Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens
When a context threshold exists, the charted base price only applies within these limits:
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| Qwen3.7-Max-Preview | Alibaba | $2.5 / 1M tokens | $7.5 / 1M tokens | -- |
| Qwen3.6-Max-Preview | Alibaba | $1.3 / 1M tokens | $7.8 / 1M tokens | <= 128 |
| Qwen3-Max-Thinking | -- | $1.2 / 1M tokens | $6 / 1M tokens | <= 32K |