

Enhance-A-Video: Better Generated Video for Free

February 11, 2025
Authors: Yang Luo, Xuanlei Zhao, Mengzhao Chen, Kaipeng Zhang, Wenqi Shao, Kai Wang, Zhangyang Wang, Yang You
cs.AI

Abstract

DiT-based video generation has achieved remarkable results, but research into enhancing existing models remains relatively unexplored. In this work, we introduce a training-free approach to enhance the coherence and quality of DiT-based generated videos, named Enhance-A-Video. The core idea is enhancing the cross-frame correlations based on non-diagonal temporal attention distributions. Thanks to its simple design, our approach can be easily applied to most DiT-based video generation frameworks without any retraining or fine-tuning. Across various DiT-based video generation models, our approach demonstrates promising improvements in both temporal consistency and visual quality. We hope this research can inspire future explorations in video generation enhancement.
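The abstract only names the core idea, so the following is a minimal sketch of what "enhancing cross-frame correlations based on non-diagonal temporal attention distributions" could look like in a DiT temporal attention block. It is not the authors' released implementation; the function name `enhance_temporal_attention` and the temperature `enhance_weight` are illustrative assumptions.

```python
# Hypothetical sketch: boost temporal attention output using the average of the
# non-diagonal (cross-frame) entries of the temporal attention map.
import torch
import torch.nn.functional as F


def enhance_temporal_attention(q, k, v, enhance_weight=1.0):
    """q, k, v: (batch, frames, dim) temporal attention inputs for one spatial token."""
    scale = q.shape[-1] ** -0.5
    attn = F.softmax(q @ k.transpose(-2, -1) * scale, dim=-1)  # (batch, frames, frames)

    # Average of the non-diagonal attention entries: how strongly each frame
    # attends to the *other* frames, i.e. a cross-frame correlation measure.
    frames = attn.shape[-1]
    off_diag = attn * (1.0 - torch.eye(frames, device=attn.device))
    cross_frame_intensity = off_diag.sum(dim=(-2, -1)) / max(frames * (frames - 1), 1)

    # Rescale the temporal attention output by a factor derived from the
    # cross-frame intensity and a tunable temperature -- no retraining needed.
    out = attn @ v  # (batch, frames, dim)
    enhance = 1.0 + enhance_weight * cross_frame_intensity  # (batch,)
    return out * enhance.view(-1, 1, 1)
```

Because the adjustment is a per-block rescaling of existing attention outputs, a wrapper like this could in principle be dropped into most DiT-based video generation frameworks at inference time, which is consistent with the training-free claim in the abstract.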