
A knowledge platform focused on LLM benchmarking, datasets, and practical instruction with continuously updated capability maps.

© 2026 DataLearner AI. DataLearner curates industry data and case studies so researchers, enterprises, and developers can rely on trustworthy intelligence.


LMArena Tracks

Text Generation, Coding, Math, Image Edit, Text-to-Video, Image-to-Video, Text-to-Image

Text-to-Video Arena Leaderboard

The latest AI video generation leaderboard, based on anonymous user voting in the Text-to-Video Arena. It covers Elo scores, confidence intervals, and vote counts for leading video models.

Top Model: happyhorse-1.0

Top Score: 1,435

Model Count: 39

Data version: 2026-05-12

Data source: LMArena

About This Leaderboard

This leaderboard ranks AI text-to-video models by generation quality. Data comes from LMArena's Text-to-Video Arena track, evaluated through anonymous blind testing by real users.

Methodology Overview

Blind testing: Users submit text descriptions, two anonymous models generate videos, and users vote for the better result.

Elo scoring: Based on the Bradley-Terry model. Higher scores indicate stronger user preference for that model's video output.

Diverse generation scenarios: Covers natural landscapes, human motion, creative animation, product showcases, and more.
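
The blind-vote and scoring steps above can be sketched as a small Bradley-Terry fit. This is a minimal illustration, not LMArena's actual pipeline, which this page does not describe: the model names, the vote counts, the iterative update (a standard minorization-maximization scheme), and the 1000-point anchor with a 400-point base-10 scale are all assumptions chosen for the example.

```python
import math


def fit_bradley_terry(wins, iters=500):
    """Estimate Bradley-Terry strengths from pairwise vote counts.

    wins[(a, b)] is the number of votes in which model a beat model b.
    Assumes every model has at least one win and at least one comparison,
    so no strength collapses to zero and no denominator is zero.
    """
    models = sorted({m for pair in wins for m in pair})
    strength = {m: 1.0 for m in models}
    for _ in range(iters):
        updated = {}
        for i in models:
            wins_i = sum(n for (a, _b), n in wins.items() if a == i)
            denom = 0.0
            for j in models:
                if j == i:
                    continue
                games = wins.get((i, j), 0) + wins.get((j, i), 0)
                denom += games / (strength[i] + strength[j])
            updated[i] = wins_i / denom
        # Rescale so the geometric mean of the strengths stays at 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_rating(strength, anchor=1000.0, scale=400.0):
    """Map strengths onto an Elo-like scale (anchor and scale are assumed)."""
    return {m: anchor + scale * math.log10(s) for m, s in strength.items()}


# Hypothetical vote counts: (winner, loser) -> number of votes.
votes = {
    ("model_a", "model_b"): 60, ("model_b", "model_a"): 40,
    ("model_b", "model_c"): 70, ("model_c", "model_b"): 30,
    ("model_a", "model_c"): 80, ("model_c", "model_a"): 20,
}

ratings = to_rating(fit_bradley_terry(votes))
for name, rating in sorted(ratings.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {rating:.0f}")
```

Under this model, the probability that model i is preferred over model j is strength_i / (strength_i + strength_j), so a larger rating gap corresponds to a stronger expected preference.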

DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.


Ranking Table

Rank  Model           Score  95% CI  Votes   Organization  License
1     happyhorse-1.0  1,435  +/-9    6,266   Alibaba-ATH   Proprietary
10    Wan2.6 T2V      1,341  +/-11   24,738  Alibaba       Proprietary
24    Hailuo 2.3      1,199  +/-12   9,370   MiniMaxAI     Proprietary
25    Hailuo 2.3      1,199  +/-7    50,014  MiniMaxAI     Proprietary
27    Hailuo 2.3      1,181  +/-12   9,333   MiniMaxAI     Proprietary

Data is for reference only. Official sources are authoritative. Click model names to view DataLearner model profiles.
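
One way to read the 95% CI column is to check whether two models' intervals overlap. The sketch below is a rough heuristic rather than a formal significance test, and it assumes the CI column gives a symmetric half-width around the score. The first pair uses the scores and intervals listed for happyhorse-1.0 and the rank-25 Hailuo 2.3 entry; the second pair is hypothetical.

```python
def interval(score, half_width):
    """Return the (low, high) bounds of a symmetric interval."""
    return score - half_width, score + half_width


def clearly_separated(a, b):
    """True when the two (score, half_width) intervals do not overlap."""
    lo_a, hi_a = interval(*a)
    lo_b, hi_b = interval(*b)
    return hi_a < lo_b or hi_b < lo_a


# (score, CI half-width) pairs taken from the ranking table.
happyhorse = (1435, 9)
hailuo_rank_25 = (1199, 7)
print(clearly_separated(happyhorse, hailuo_rank_25))  # True

# Hypothetical close pair: the intervals overlap, so the order is uncertain.
close_a = (1199, 7)
close_b = (1190, 7)
print(clearly_separated(close_a, close_b))  # False
```

When intervals overlap, the rank order between the two models should be treated as unsettled rather than as a clear win for the higher score.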

2026-05 Market Signals

Current Best (SOTA)

01. Veo 3.1 Audio 1080p

02. Veo 3.1 Fast-Audio 1080p

03. Sora-2-Pro

Best China Model

  • Wan2.6-T2V
  • Seedance-V1.5-Pro
  • Kling-2.6-Pro

Best Open Model

  • Wan-V2.2-A14B
  • Kandinsky-5.0-T2V-Pro
  • Mochi-V1

FAQ

01. How does Text-to-Video Arena rank models?

Rankings are based on side-by-side anonymous votes. Users enter the same prompt, compare outputs from two hidden models, and choose the better video. Elo-style scoring then aggregates those comparisons into a leaderboard.
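
To illustrate how Elo-style aggregation turns individual votes into scores, here is a minimal sketch of the classic online Elo update. Whether the arena updates scores this way is an assumption: the methodology section says scores are based on the Bradley-Terry model, which is typically fit to all votes at once rather than vote by vote. The K-factor of 32 and the 400-point scale are conventional defaults, not values from this page.

```python
def expected_score(rating_a, rating_b):
    """Expected probability that A is preferred over B on a 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(winner, loser, k=32.0):
    """Return new (winner, loser) ratings after one vote.

    k is an assumed K-factor; larger values react faster to each vote.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


# Two models starting level: the winner gains exactly half of k.
print(elo_update(1000.0, 1000.0))  # (1016.0, 984.0)

# A 94-point gap, such as 1,435 versus 1,341, implies the higher-rated
# model is preferred in roughly 63% of head-to-head votes.
print(round(expected_score(1435, 1341), 2))  # 0.63
```

An upset moves ratings more than an expected result, because the gain is proportional to how surprising the outcome was.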

02. What is audio-video sync, and why does it matter?

Audio-video sync means generated sound effects or speech match the motion and timing in the video. It matters because synchronized audio can make generated clips usable with less post-production work.

03. What use cases are text-to-video models good for?

Common uses include short-form video creation, marketing assets, e-commerce product clips, storyboarding, game cinematics, and educational demos.

04. Which models support the longest generation length?

Maximum clip lengths change quickly and vary by product tier and release. In practice, check the current model documentation and compare both maximum duration and quality consistency across longer clips.