Temporal Regularization Makes Your Video Generator Stronger
March 19, 2025
Authors: Harold Haodong Chen, Haojian Huang, Xianfeng Wu, Yexin Liu, Yajing Bai, Wen-Jie Shu, Harry Yang, Ser-Nam Lim
cs.AI
Abstract
Temporal quality is a critical aspect of video generation, as it ensures
consistent motion and realistic dynamics across frames. However, achieving high
temporal coherence and diversity remains challenging. In this work, we present
the first exploration of temporal augmentation in video generation and
introduce FluxFlow, an initial strategy designed to enhance temporal
quality. Operating at the data level, FluxFlow applies controlled temporal
perturbations without requiring architectural modifications. Extensive
experiments on the UCF-101 and VBench benchmarks demonstrate that FluxFlow
significantly improves temporal coherence and diversity across various video
generation models, including U-Net, DiT, and autoregressive (AR) architectures, while
preserving spatial fidelity. These findings highlight the potential of temporal
augmentation as a simple yet effective approach to advancing video generation
quality.
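
The abstract does not specify which perturbations FluxFlow applies, so the following is only a minimal sketch of what a data-level temporal perturbation could look like, assuming that swapping a few nearby frames of a training clip is an acceptable stand-in. The function name `perturb_frame_order` and the parameters `num_swaps` and `max_shift` are hypothetical and do not come from the paper.

```python
import random
from typing import List, Optional, Sequence, TypeVar

Frame = TypeVar("Frame")


def perturb_frame_order(
    frames: Sequence[Frame],
    num_swaps: int = 1,
    max_shift: int = 2,
    seed: Optional[int] = None,
) -> List[Frame]:
    """Return a copy of `frames` with a few nearby frames swapped.

    Illustrative assumption only: this is NOT FluxFlow's actual procedure.
    Each swap exchanges two frames that are at most `max_shift` positions
    apart, so the clip is disturbed locally rather than fully shuffled.
    """
    rng = random.Random(seed)
    out = list(frames)  # copy so the input clip is never mutated
    n = len(out)
    if n < 2:
        return out
    for _ in range(num_swaps):
        i = rng.randrange(n)
        lo, hi = max(0, i - max_shift), min(n - 1, i + max_shift)
        j = rng.randint(lo, hi)  # partner index inside the local window
        out[i], out[j] = out[j], out[i]
    return out


# Usage: frame indices stand in for real frames of a 16-frame clip.
clip = list(range(16))
augmented = perturb_frame_order(clip, num_swaps=2, max_shift=2, seed=0)
```

Because the perturbation touches only the order of frames in each training clip, it can sit in the data-loading pipeline and leave the generator's architecture unchanged, which is the property the abstract emphasizes.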