Open-Sora 2.0: Training a Commercial-Level Video Generation Model in $200k
March 12, 2025
Authors: Xiangyu Peng, Zangwei Zheng, Chenhui Shen, Tom Young, Xinying Guo, Binluo Wang, Hang Xu, Hongxin Liu, Mingyan Jiang, Wenjun Li, Yuhui Wang, Anbang Ye, Gang Ren, Qianran Ma, Wanying Liang, Xiang Lian, Xiwen Wu, Yuting Zhong, Zhuangyan Li, Chaoyu Gong, Guojun Lei, Leijun Cheng, Limin Zhang, Minghao Li, Ruijie Zhang, Silan Hu, Shijie Huang, Xiaokang Wang, Yuanheng Zhao, Yuqi Wang, Ziang Wei, Yang You
cs.AI
Abstract
Video generation models have achieved remarkable progress in the past year.
The quality of AI video continues to improve, but at the cost of larger model
size, increased data quantity, and greater demand for training compute. In this
report, we present Open-Sora 2.0, a commercial-level video generation model
trained for only $200k. With this model, we demonstrate that the cost of
training a top-performing video generation model is highly controllable. We
detail all techniques that contribute to this efficiency breakthrough,
including data curation, model architecture, training strategy, and system
optimization. According to human evaluation results and VBench scores,
Open-Sora 2.0 is comparable to globally leading video generation models, including
the open-source HunyuanVideo and the closed-source Runway Gen-3 Alpha. By
making Open-Sora 2.0 fully open-source, we aim to democratize access to
advanced video generation technology, fostering broader innovation and
creativity in content creation. All resources are publicly available at:
https://github.com/hpcaitech/Open-Sora.