Composer 2.5 Benchmark Details

Composer 2.5's benchmark results are led by SWE-bench Multilingual (score 79.80, rank 3 of 23) and Terminal Bench 2.0 (score 69.30, rank 7 of 47). This page also compares it with three competitor models and three earlier models in the same series, including performance and pricing views where available. One source link is attached for reference.

Benchmark Results

AI Agent - Tool Usage (1 evaluation)

Benchmark	Mode	Score	Rank / total
Terminal Bench 2.0	Thinking Mode	69.30	7 / 47

Coding and Software Engineering (1 evaluation)

Benchmark	Mode	Score	Rank / total
SWE-bench Multilingual	Thinking Mode	79.80	3 / 23

Compare with other models

Competitor Comparison

Benchmark scores for Composer 2.5 compared against top models in its class

Models compared: Composer 2.5, Opus 4.7, GPT-5.5, Kimi K2.6

The chart shows each model's highest score per benchmark within the current filter. Benchmarks scored out of 100 are drawn at their raw height; benchmarks on any other scale are scaled within that benchmark, while the labels keep the original scores.
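The scaling rule described above can be sketched as follows. This is a minimal illustration, not the site's actual implementation: the function name `bar_heights`, the choice to scale relative to the highest score in the benchmark, and the Elo-style example values are all assumptions.

```python
from typing import Dict, Optional


def bar_heights(scores: Dict[str, Optional[float]], out_of_100: bool) -> Dict[str, float]:
    """Map one benchmark's scores to bar heights on a 0-100 axis.

    Models with no score (None) are left out of the result.
    """
    present = {model: s for model, s in scores.items() if s is not None}
    if not present:
        return {}
    if out_of_100:
        # Scores already on a 0-100 scale are drawn at their raw value.
        return dict(present)
    # Assumption: other scales are normalized against the benchmark's top score.
    top = max(present.values())
    if top <= 0:
        return {model: 0.0 for model in present}
    return {model: 100.0 * s / top for model, s in present.items()}


# Terminal Bench 2.0 scores from this page (already out of 100).
terminal_bench = {"Composer 2.5": 69.30, "Opus 4.7": 69.40, "GPT-5.5": 82.70, "Kimi K2.6": 66.70}
print(bar_heights(terminal_bench, out_of_100=True))

# Hypothetical Elo-style benchmark; these values are invented for illustration.
elo_like = {"Model A": 1500.0, "Model B": 3000.0, "Model C": None}
print(bar_heights(elo_like, out_of_100=False))
```

In either case the chart label would still show the original score; only the bar height changes.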

Two benchmarks have comparable scores. Each model shows its best score, with the mode label alongside.

Benchmark	Category	Composer 2.5 (current)	Opus 4.7	GPT-5.5	Kimi K2.6
Terminal Bench 2.0	AI Agent - Tool Usage	69.30 (Thinking Enabled)	69.40 (Extended Thinking | Tools)	82.70 (Thinking Level: High | Tools)	66.70 (Thinking Enabled | Tools)
SWE-bench Multilingual	Coding and Software Engineering	79.80 (Thinking Enabled)	--	--	76.70 (Thinking Enabled | Tools)

Standard API Pricing: Composer 2.5 vs. Peer Models

Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.

Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens

Model	Supplier	Standard input	Standard output	Base price applies to
Composer 2.5	Cursor	$0.5 / 1M tokens	$2.5 / 1M tokens	—
Opus 4.7	Anthropic	$5 / 1M tokens	$25 / 1M tokens	—
GPT-5.5	OpenAI	$5 / 1M tokens	$30 / 1M tokens	—
Kimi K2.6	Moonshot AI	$0.95 / 1M tokens	$4 / 1M tokens	—
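To make the per-token prices easier to compare, the sketch below turns them into a cost for a single workload. The prices are copied from the table above; the function name `workload_cost` and the workload size (2M input tokens, 500K output tokens) are assumptions chosen only for illustration.

```python
# Standard prices from the table above, in USD per 1M tokens: (input, output).
PRICES = {
    "Composer 2.5": (0.5, 2.5),
    "Opus 4.7": (5.0, 25.0),
    "GPT-5.5": (5.0, 30.0),
    "Kimi K2.6": (0.95, 4.0),
}


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a workload at the model's standard rates."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload: 2M input tokens and 500K output tokens.
for model in PRICES:
    print(f"{model}: ${workload_cost(model, 2_000_000, 500_000):.2f}")
```

For this assumed workload, Composer 2.5 comes to $2.25, against $22.50 for Opus 4.7, $25.00 for GPT-5.5, and $3.90 for Kimi K2.6. A different input-to-output ratio would shift the gaps, since output tokens are priced at five to six times the input rate for most of these models.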

Version History

How each version of the Composer 2.5 series stacks up on benchmark tests

Versions compared: Composer 2.5, Composer 2, Composer 1.5, Composer 1

Two benchmarks have comparable scores. Each model shows its best score, with the mode label alongside.

Benchmark	Category	Composer 2.5 (current)	Composer 2	Composer 1.5	Composer 1
Terminal Bench 2.0	AI Agent - Tool Usage	69.30 (Thinking Enabled)	61.70 (Thinking Enabled)	47.90 (Thinking Enabled)	40.00 (Thinking Enabled)
SWE-bench Multilingual	Coding and Software Engineering	79.80 (Thinking Enabled)	73.70 (Thinking Enabled)	65.90 (Thinking Enabled)	56.90 (Thinking Enabled)
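The version-to-version gains implied by the table can be worked out with a short script. This is only a sketch: the scores are copied from the table above, while the helper name `version_gains` and the oldest-to-newest ordering are choices made for this example.

```python
from typing import List

# Versions ordered oldest to newest; scores copied from the table above.
VERSIONS = ["Composer 1", "Composer 1.5", "Composer 2", "Composer 2.5"]
SCORES = {
    "Terminal Bench 2.0": [40.00, 47.90, 61.70, 69.30],
    "SWE-bench Multilingual": [56.90, 65.90, 73.70, 79.80],
}


def version_gains(scores: List[float]) -> List[float]:
    """Return the score change between each pair of consecutive versions."""
    return [round(later - earlier, 2) for earlier, later in zip(scores, scores[1:])]


for benchmark, scores in SCORES.items():
    gains = version_gains(scores)
    steps = ", ".join(f"{VERSIONS[i + 1]}: {g:+.2f}" for i, g in enumerate(gains))
    total = scores[-1] - scores[0]
    print(f"{benchmark} -> {steps} (total {total:+.2f})")
```

On these numbers the largest single step is Composer 1.5 to Composer 2 on Terminal Bench 2.0 (+13.80 points), and the cumulative gain from Composer 1 to Composer 2.5 is 29.30 points on Terminal Bench 2.0 and 22.90 points on SWE-bench Multilingual.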

Single-Benchmark Version Trend

[Chart: Terminal Bench 2.0 (AI Agent - Tool Usage) score by version. Modes: Normal, Normal + Tools, Thinking, Thinking + Tools, Deep, Deep + Tools.]

The x-axis shows model and release date and the y-axis shows score. Solid lines connect the same mode across versions, while dotted guides align modes within the same generation.

Standard API Pricing Across the Composer 2.5 Series

Source: DataLearnerAI. Standard text prices shown here use the default supplier. · USD / 1M tokens

Model	Supplier	Standard input	Standard output	Base price applies to
Composer 2.5	Cursor	$0.5 / 1M tokens	$2.5 / 1M tokens	—
Composer 2	Cursor	$0.5 / 1M tokens	$2.5 / 1M tokens	—
Composer 1.5	Cursor	$3.5 / 1M tokens	$17.5 / 1M tokens	—
Composer 1	Cursor	$1.25 / 1M tokens	$10 / 1M tokens	—

Sources

cursor.com