加载中...

LLM Benchmark Performance Comparison

Quickly view LLM performance across benchmarks like MMLU Pro, HLE, SWE-Bench, and more. Compare models across general knowledge, coding, and reasoning capabilities. Customize your comparison by selecting specific models and benchmarks.

Detailed benchmark descriptions available at:LLM Benchmark List & Guide

Updated on: 2025/11/08 22:10:24

Benchmark switcher

Pick the leaderboard to sync both chart and table

MMLU Pro GPQA Diamond SWE-bench Verified MATH-500 AIME 2024 LiveCodeBench

Filters

Active

Filter by parameter size

All 3B and below 7B 13B 34B

LLM Performance Results

Data source: DataLearnerAI

LLM Performance Results

Data source: DataLearnerAI

LiveCodeBench

Rank	Model	MMLU Pro	GPQA Diamond	SWE-bench Verified	MATH-500	AIME 2024	LiveCodeBench	Params (B)	License
1	Pangu Pro MoE	82.60	73.70	0.00	96.80	79.20	59.60	719B	Free commercial
2	Llama3.3-70B-Instruct	68.90	50.50	0.00