BIG-bench

Problem Count: 200
Institution: Google
Category: General Evaluation
Metrics: Varies
Language: Multilingual
Difficulty: Advanced

Overview

A broad benchmark with more than 200 tasks covering reasoning, language understanding, knowledge, and other model capabilities.

Latest BIG-bench model rankings and full benchmark leaderboard

Browse the latest scores, model modes, release dates, and parameter sizes for BIG-bench.

Source: DataLearnerAI

Data is sourced primarily from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, then from third-party evaluators.

No benchmark data available yet