

Temporal Regularization Makes Your Video Generator Stronger

March 19, 2025
Authors: Harold Haodong Chen, Haojian Huang, Xianfeng Wu, Yexin Liu, Yajing Bai, Wen-Jie Shu, Harry Yang, Ser-Nam Lim
cs.AI

Abstract
Temporal quality is a critical aspect of video generation, as it ensures consistent motion and realistic dynamics across frames. However, achieving high temporal coherence and diversity remains challenging. In this work, we explore temporal augmentation in video generation for the first time, and introduce FluxFlow for initial investigation, a strategy designed to enhance temporal quality. Operating at the data level, FluxFlow applies controlled temporal perturbations without requiring architectural modifications. Extensive experiments on UCF-101 and VBench benchmarks demonstrate that FluxFlow significantly improves temporal coherence and diversity across various video generation models, including U-Net, DiT, and AR-based architectures, while preserving spatial fidelity. These findings highlight the potential of temporal augmentation as a simple yet effective approach to advancing video generation quality.
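The abstract describes FluxFlow as a data-level strategy that applies controlled temporal perturbations to training clips without touching the model architecture. The exact perturbations are not specified here, so the following is only a hypothetical sketch of the general idea: lightly disturbing frame order (here, swapping a few adjacent frame pairs) so the generator cannot overfit to rigid frame-index correlations. The function name `temporal_perturb` and the swap-based perturbation are illustrative assumptions, not the paper's method.

```python
import numpy as np

def temporal_perturb(frames: np.ndarray, num_swaps: int = 2, seed=None) -> np.ndarray:
    """Illustrative data-level temporal perturbation (NOT the paper's exact method).

    Swaps `num_swaps` randomly chosen pairs of adjacent frames in a clip of
    shape (T, H, W, C), mildly disturbing temporal order while preserving
    the set of frames (and thus spatial content).
    """
    rng = np.random.default_rng(seed)
    out = frames.copy()
    t = out.shape[0]
    for _ in range(num_swaps):
        i = int(rng.integers(0, t - 1))      # pick an adjacent pair (i, i+1)
        out[[i, i + 1]] = out[[i + 1, i]]    # swap the two frames in place
    return out
```

Because the perturbation only reorders frames, spatial fidelity of each individual frame is untouched, which matches the abstract's claim that temporal augmentation can improve coherence and diversity while preserving spatial quality.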

