GPT-4's benchmark results are currently led by MMLU (rank 30 of 63, score 86.40), HumanEval (rank 27 of 38, score 67), and DROP (rank 7 of 7, score 80.90). This page also compares it with 1 competitor model and 2 predecessor or same-series models, including performance and pricing views where available. 1 source link is attached for reference.
Side-by-side benchmark comparison of GPT-4 against leading peer models
Top 3 benchmarks with comparable scores
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.
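Since input and output tokens are billed at separate rates, the per-request cost follows directly from the two prices shown in the chart. A minimal sketch, using hypothetical per-million-token rates (the real standard-text rates come from the supplier's pricing page):

```python
# Hypothetical per-1M-token rates for illustration only;
# actual rates are those shown in the pricing chart above.
INPUT_PRICE_PER_M = 30.00   # USD per 1M input tokens (assumed)
OUTPUT_PRICE_PER_M = 60.00  # USD per 1M output tokens (assumed)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Standard-text cost: input and output tokens billed at separate rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a request with 10k input tokens and 2k output tokens
print(f"${request_cost(10_000, 2_000):.2f}")  # → $0.42
```

If extended-context pricing applies above a threshold, the same formula would be evaluated with the higher rates for the tokens beyond that threshold.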
Track the evolution of the GPT-4 series across generations
Shows standard text input and output pricing side by side for each model. If extended-context pricing exists, the chart keeps the base rate and explains the threshold below.
Source: DataLearnerAI. Standard text prices shown here use the default supplier.