ARC

Updated Jul 22, 2025

Problem Count: 7787
Institution: Allen Institute for AI
Category: Commonsense Reasoning
Metrics: Accuracy
Language: English
Difficulty: Advanced

Overview

ARC (AI2 Reasoning Challenge) is a benchmark of 7,787 multiple-choice, grade-school-level science questions used to evaluate scientific and commonsense reasoning. Results are reported as accuracy.
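Since the benchmark reports accuracy, a score is the percentage of questions for which the chosen option matches the answer key. The following is a minimal sketch of that calculation, not the official scoring script; the function name `accuracy`, the letter-label inputs, and the example answers are assumptions made for illustration.

```python
from typing import Sequence


def accuracy(predictions: Sequence[str], answer_keys: Sequence[str]) -> float:
    """Return the percentage of predictions that match the answer key.

    Both inputs are option labels such as "A", "B", "C", "D" (hypothetical format).
    """
    if len(predictions) != len(answer_keys):
        raise ValueError("predictions and answer_keys must have the same length")
    if not answer_keys:
        raise ValueError("cannot score an empty question set")
    # Compare labels case-insensitively and ignore surrounding whitespace.
    correct = sum(
        p.strip().upper() == k.strip().upper()
        for p, k in zip(predictions, answer_keys)
    )
    return 100.0 * correct / len(answer_keys)


# Made-up example: four questions, three answered correctly.
example_score = accuracy(["A", "c", "B", "D"], ["A", "C", "B", "A"])
```

A leaderboard value such as 68.20 would then correspond to roughly 68 of every 100 questions answered correctly under whatever prompting setup the reporting source used.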


Latest ARC model rankings and full benchmark leaderboard

Browse the latest scores, model modes, release dates, and parameter sizes for ARC.

Source: DataLearnerAI

Scores are taken from official releases (GitHub, Hugging Face, papers) first, then from benchmark leaderboards, then from third-party evaluators.
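The source ordering described above can be read as a priority rule: when several reports exist for the same model, the one from the highest-priority source is used. The sketch below illustrates that reading only; the source labels, the `pick_report` helper, and the placeholder scores are hypothetical and do not describe DataLearnerAI's actual pipeline.

```python
from typing import Optional

# Highest priority first, following the order stated in the text.
SOURCE_PRIORITY = ["official_release", "benchmark_leaderboard", "third_party"]


def pick_report(reports: list[dict]) -> Optional[dict]:
    """Return the report from the highest-priority known source, or None."""
    known = [r for r in reports if r["source"] in SOURCE_PRIORITY]
    if not known:
        return None
    # A lower index in SOURCE_PRIORITY means a more trusted source.
    return min(known, key=lambda r: SOURCE_PRIORITY.index(r["source"]))


# Placeholder values for illustration only, not real benchmark results.
candidate_reports = [
    {"source": "third_party", "score": 1.0},
    {"source": "official_release", "score": 2.0},
    {"source": "benchmark_leaderboard", "score": 3.0},
]
chosen = pick_report(candidate_reports)
```

Here `chosen` is the `official_release` entry, regardless of the order in which the reports are listed.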

Rank	Model (mode)	Score	Release date	Parameters	License
1	Gemma 2 - 9B Standard Mode	68.20	2024-06-27	9B	Free Commercial
2	Qwen2.5-7B Standard Mode	63.70	2024-09-18	7B	Free Commercial
3	Mistral-7B-Instruct-v0.3 Standard Mode	60.00	2024-05-22	7B	Free Commercial
4	Llama3.1-8B Standard Mode	59.30	2024-07-23	8B	Free Commercial
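The row order in the table is consistent with ranking by descending score. As a small sketch under that assumption, the rows can be sorted and numbered as follows; the tuple layout and variable names are illustrative, and the values are copied from the table.

```python
# (model, score, release date) taken from the table, deliberately unordered.
entries = [
    ("Llama3.1-8B", 59.30, "2024-07-23"),
    ("Gemma 2 - 9B", 68.20, "2024-06-27"),
    ("Mistral-7B-Instruct-v0.3", 60.00, "2024-05-22"),
    ("Qwen2.5-7B", 63.70, "2024-09-18"),
]

# Sort by score, highest first, then assign ranks starting at 1.
ranked = sorted(entries, key=lambda entry: entry[1], reverse=True)
leaderboard = [
    (rank, model, score)
    for rank, (model, score, _released) in enumerate(ranked, start=1)
]

for rank, model, score in leaderboard:
    print(f"{rank}\t{model}\t{score:.2f}")
```

This sketch does not handle ties; how the site orders models with equal scores is not stated on the page.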
