Text-to-Video Arena Leaderboard
The latest AI video generation leaderboard based on Text-to-Video Arena anonymous user voting. Covers Elo scores, confidence intervals, and vote counts for leading video models.
Top Model: Seedance 2.0
Top Score: 1,457
Model Count: 39
Data version: 2026-05-12
Data source: LMArena
About This Leaderboard
This leaderboard ranks AI text-to-video models by generation quality. Data comes from LMArena's Text-to-Video Arena track, evaluated through anonymous blind testing by real users.
Methodology Overview
Blind testing: Users submit text descriptions, two anonymous models generate videos, and users vote for the better result.
Elo scoring: Based on the Bradley-Terry model. Higher scores indicate stronger user preference for that model's video output.
Diverse generation scenarios: Covers natural landscapes, human motion, creative animation, product showcases, and more.
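To make the score scale concrete, here is a minimal sketch of how an Elo-style score gap maps to an expected preference rate. It assumes the conventional base-10 logistic curve with a scale of 400 points; this page does not state the exact constants the arena uses, so treat `scale` as an assumption. The two scores are the top two entries in the ranking table.

```python
def expected_win_probability(score_a: float, score_b: float, scale: float = 400.0) -> float:
    """Probability that model A is preferred over model B under an Elo-style curve.

    The base-10 logistic form and scale=400 are the conventional Elo choices,
    assumed here for illustration rather than taken from the leaderboard.
    """
    return 1.0 / (1.0 + 10.0 ** ((score_b - score_a) / scale))


# Scores of the top two rows in the ranking table.
p_top = expected_win_probability(1457, 1435)
print(f"Expected preference rate for the higher-scored model: {p_top:.3f}")  # roughly 0.53
```

Under these assumptions, a 22-point gap corresponds to only a slight edge in head-to-head votes, which is why small score differences should be read together with the confidence intervals.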
DataLearner provides in-depth analysis on top of the raw data, linking leaderboard models to the DataLearner model database so you can quickly access model details, API pricing, benchmark scores, and more.
Ranking Table
| Rank | Model | Score | 95% CI | Votes | Organization | License |
|---|---|---|---|---|---|---|
| 1 | Seedance 2.0 | 1,457 | +/-9 | 22,185 | ByteDance Seed team | Proprietary |
| 2 | happyhorse-1.0 | 1,435 | +/-9 | 6,266 | Alibaba-ATH | Proprietary |
| 3 | Veo 3.1 Generate (Preview) | 1,372 | +/-11 | 13,978 | Google DeepMind | Proprietary |
| 4 | Sora 2 | 1,368 | +/-8 | 33,475 | OpenAI | Proprietary |
| 5 | Veo 3.1 Generate (Preview) | 1,366 | +/-14 | 13,689 | Google DeepMind | Proprietary |
| 6 | Veo 3.1 Fast (Preview) | 1,364 | +/-11 | 39,325 | Google DeepMind | Proprietary |
| 7 | Veo 3.1 Fast (Preview) | 1,364 | +/-11 | 14,089 | Google DeepMind | Proprietary |
| 8 | | 1,357 | +/-7 | 108,058 | xAI | Proprietary |
| 9 | Veo 3.1 Fast (Preview) | 1,349 | +/-11 | 25,154 | Google DeepMind | Proprietary |
| 10 | Wan2.6 T2V | 1,341 | +/-11 | 24,738 | Alibaba | Proprietary |
| 11 | Veo 3.1 Generate (Preview) | 1,341 | +/-12 | 18,966 | Google DeepMind | Proprietary |
| 12 | Sora 2 | 1,339 | +/-7 | 44,913 | OpenAI | Proprietary |
| 13 | Wan2.1-T2V-14B | 1,260 | +/-13 | 13,064 | Alibaba | Proprietary |
| 14 | Seedance 2.0 | 1,258 | +/-7 | 60,453 | ByteDance Seed team | Proprietary |
| 15 | Veo 3.1 Generate (Preview) | 1,254 | +/-11 | 14,949 | Google DeepMind | Proprietary |
| 16 | Veo 3.1 Fast (Preview) | 1,250 | +/-12 | 15,230 | Google DeepMind | Proprietary |
| 17 | Pixverse v5.6 | 1,238 | +/-9 | 20,975 | Pixverse | Proprietary |
| 18 | Runway Gen-4.5 | 1,235 | +/-12 | 20,839 | Runway | Proprietary |
| 19 | Kling 2.5 Turbo | 1,221 | +/-17 | 2,104 | Kunlun Tech | Proprietary |
| 20 | Kling 2.5 Turbo | 1,219 | +/-7 | 60,034 | Kunlun Tech | Proprietary |
| 21 | p-video | 1,209 | +/-16 | 7,041 | Pruna | Proprietary |
| 22 | Ray 3 | 1,207 | +/-22 | 1,121 | Luma AI | Proprietary |
| 23 | Kling 2.5 Turbo | 1,207 | +/-27 | 1,193 | Kunlun Tech | Proprietary |
| 24 | | 1,199 | +/-12 | 9,370 | MiniMaxAI | Proprietary |
| 25 | | 1,199 | +/-7 | 50,014 | MiniMaxAI | Proprietary |
| 26 | Seedance 2.0 | 1,192 | +/-11 | 12,122 | ByteDance Seed team | Proprietary |
| 27 | | 1,181 | +/-12 | 9,333 | MiniMaxAI | Proprietary |
| 28 | Kandinsky 5.0 T2V Pro | 1,176 | +/-21 | 2,020 | Kandinsky | MIT |
| 29 | Hunyuan-A13B-Instruct | 1,170 | +/-16 | 4,273 | Tencent AI Lab | tencent-hunyuan-community |
| 30 | Veo 3.1 Generate (Preview) | 1,164 | +/-16 | 6,509 | Google DeepMind | Proprietary |
| 31 | Kling 2.5 Turbo | 1,164 | +/-9 | 14,049 | Kunlun Tech | Proprietary |
| 32 | LTX 2 19B | 1,135 | +/-9 | 42,742 | Lightricks | ltx-2-community-license-agreement |
| 33 | Wan2.1-T2V-14B | 1,133 | +/-15 | 10,419 | Alibaba | Apache 2.0 |
| 34 | Kandinsky 5.0 T2V Lite | 1,115 | +/-18 | 1,475 | Kandinsky | MIT |
| 35 | Seedance 2.0 | 1,114 | +/-9 | 16,214 | ByteDance Seed team | Proprietary |
| 36 | sora | 1,070 | +/-16 | 4,080 | OpenAI | Proprietary |
| 37 | Ray 2 | 1,066 | +/-17 | 5,217 | Luma AI | Proprietary |
| 38 | Pika v2.2 | 1,009 | +/-15 | 5,728 | Pika | Proprietary |
| 39 | Mochi v1 | 1,007 | +/-16 | 5,862 | Genmo AI | Apache 2.0 |
Data is for reference only. Official sources are authoritative. Click model names to view DataLearner model profiles.
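When reading adjacent rows, it helps to check whether their 95% confidence intervals overlap. The sketch below treats each row as the interval score ± margin and tests for overlap. This is a rough reading aid, not a formal significance test, and the function name is hypothetical. The numbers are copied from the first four rows of the table.

```python
def intervals_overlap(score_a: float, margin_a: float, score_b: float, margin_b: float) -> bool:
    """Return True if [score - margin, score + margin] overlaps for the two models."""
    low_a, high_a = score_a - margin_a, score_a + margin_a
    low_b, high_b = score_b - margin_b, score_b + margin_b
    return low_a <= high_b and low_b <= high_a


# Seedance 2.0 (1,457 +/-9) vs happyhorse-1.0 (1,435 +/-9):
# intervals 1448..1466 and 1426..1444 do not touch.
print(intervals_overlap(1457, 9, 1435, 9))   # False

# Veo 3.1 Generate (1,372 +/-11) vs Sora 2 (1,368 +/-8):
# intervals 1361..1383 and 1360..1376 overlap.
print(intervals_overlap(1372, 11, 1368, 8))  # True
```

Overlapping intervals suggest that the order of two models could plausibly flip with more votes, so rank gaps in the middle of the table are best read as approximate.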
2026-05 Market Signals
Current Best (SOTA)
Veo 3.1 Audio 1080p
Veo 3.1 Fast-Audio 1080p
Sora-2-Pro
Best China Model
Wan2.6-T2V
Seedance-V1.5-Pro
Kling-2.6-Pro
Best Open Model
Wan-V2.2-A14B
Kandinsky-5.0-T2V-Pro
Mochi-V1
FAQ
How does Text-to-Video Arena rank models?
Rankings are based on side-by-side anonymous votes. Users enter the same prompt, compare outputs from two hidden models, and choose the better video. Elo-style scoring then aggregates those comparisons into a leaderboard.
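The aggregation step can be illustrated with a small Bradley-Terry fit. This is a sketch under stated assumptions, not the arena's actual pipeline: the votes are made-up toy data, the model names are placeholders, the fit uses the classic iterative (minorization-maximization) update, and the mapping to an Elo-like scale (offset 1000, scale 400) is a conventional choice that this page does not confirm.

```python
import math
from collections import defaultdict


def fit_bradley_terry(votes, iterations=200):
    """Fit Bradley-Terry strengths from (winner, loser) pairs.

    Assumes every model wins at least once and all models are connected
    through comparisons; otherwise the estimates are not well defined.
    """
    wins = defaultdict(int)         # total wins per model
    pair_counts = defaultdict(int)  # comparisons per unordered pair
    models = set()
    for winner, loser in votes:
        wins[winner] += 1
        pair_counts[frozenset((winner, loser))] += 1
        models.update((winner, loser))

    strength = {m: 1.0 for m in models}
    for _ in range(iterations):
        updated = {}
        for i in models:
            denom = sum(
                pair_counts[frozenset((i, j))] / (strength[i] + strength[j])
                for j in models
                if j != i
            )
            updated[i] = wins[i] / denom
        # Rescale so the geometric mean of the strengths stays at 1.
        log_mean = sum(math.log(s) for s in updated.values()) / len(updated)
        strength = {m: s / math.exp(log_mean) for m, s in updated.items()}
    return strength


def to_elo_scale(strength, offset=1000.0, scale=400.0):
    """Map strengths to Elo-like scores (offset and scale are assumed conventions)."""
    return {m: offset + scale * math.log10(s) for m, s in strength.items()}


# Hypothetical toy votes as (winner, loser) pairs.
votes = (
    [("model_a", "model_b")] * 7 + [("model_b", "model_a")] * 3
    + [("model_b", "model_c")] * 6 + [("model_c", "model_b")] * 4
    + [("model_a", "model_c")] * 8 + [("model_c", "model_a")] * 2
)
scores = to_elo_scale(fit_bradley_terry(votes))
for name in sorted(scores, key=scores.get, reverse=True):
    print(f"{name}: {scores[name]:.0f}")
```

Unlike sequential Elo updates, a Bradley-Terry fit uses all votes at once, so the result does not depend on the order in which votes arrived.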
What is audio-video sync, and why does it matter?
Audio-video sync means generated sound effects or speech match the motion and timing in the video. It matters because synchronized audio can make generated clips usable with less post-production work.
What use cases are text-to-video models good for?
Common uses include short-form video creation, marketing assets, e-commerce product clips, storyboarding, game cinematics, and educational demos.
Which models support the longest generation length?
Maximum generation lengths change quickly with each release and vary by product tier. In practice, check the current model documentation and compare both the maximum duration and how consistent the quality stays across longer clips.


