

HellaSwag

Updated: 2026-04-03
Views: 1,030
Number of questions: 70,000
Publisher: University of Washington
Evaluation category: Commonsense reasoning
Metric: Accuracy
Supported language: English
Difficulty: Intermediate

Introduction

A benchmark of 70,000 multiple-choice questions for evaluating a model's commonsense reasoning.

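Related resources

  • Original paper: read the academic paper
  • Dataset: download the evaluation dataset
  • Official website: browse the project's official site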

HellaSwag Model Score Leaderboard

Source: DataLearnerAI

Data is sourced first from official releases (GitHub, Hugging Face, papers), then from benchmark leaderboards, and then from third-party evaluators.

Mode legend: normal, thinking, low, medium, high, deeper thinking, parallel_thinking

Latest HellaSwag model rankings and full benchmark leaderboard

Browse the latest scores, model modes, release dates, and parameter sizes for HellaSwag.


HellaSwag detailed ranking table

Rank  Model               Mode           Score  Release date  Parameters
1     Claude3-Opus        Standard Mode  95.40  2024-03-04    Unknown
2     Gemma2-27B          Standard Mode  86.40  2024-05-14    27B
3     Gemma 3 - 27B (IT)  Standard Mode  85.60  2025-03-12    27B