Benchmark Results
Version History
How each version of the Grok 4.1 series stacks up on benchmark tests
1 benchmark with comparable scores. Each model shows its best score, with the mode label shown alongside it.
| Benchmark | Grok 4.1 (Current) | GPT-4o (2024-11-20) | GPT-4o |
|---|---|---|---|
| SWE-bench Verified (Programming and Software Engineering) | 54.60 (Standard Mode) | 31.00 (Standard Mode) | 31.00 (Standard Mode) |
Single-Benchmark Version Trend
Viewing: SWE-bench Verified · Programming and Software Engineering
Standard API Pricing Across the Grok 4.1 Series
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.
These models use different currencies or billing units, so the page falls back to raw price values instead of a shared bar chart.
| Model | Supplier | Standard input | Standard output | Base price applies to |
|---|---|---|---|---|
| GPT-4o | — | 2.5 USD / 1M tokens | 10 USD / 1M tokens | — |
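
To make the per-million-token rates concrete, here is a minimal sketch of how a request cost could be estimated from the GPT-4o row above. The helper name `estimate_cost_usd` and the token counts are hypothetical, chosen only for illustration. The sketch assumes the base rate applies to the whole request and ignores any extended-context pricing.

```python
# Standard GPT-4o rates from the pricing table, in USD per 1M tokens.
INPUT_USD_PER_MILLION = 2.5
OUTPUT_USD_PER_MILLION = 10.0


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: estimated cost in USD at the base rates."""
    total = (input_tokens * INPUT_USD_PER_MILLION
             + output_tokens * OUTPUT_USD_PER_MILLION)
    return total / 1_000_000


# Illustrative request: 200,000 input tokens and 50,000 output tokens.
print(estimate_cost_usd(200_000, 50_000))  # prints 1.0
```

At these rates an output token costs four times as much as an input token, so the 50,000 output tokens in this example contribute as much to the total as the 200,000 input tokens.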