Temporal Regularization Makes Your Video Generator Stronger

March 19, 2025
Authors: Harold Haodong Chen, Haojian Huang, Xianfeng Wu, Yexin Liu, Yajing Bai, Wen-Jie Shu, Harry Yang, Ser-Nam Lim
cs.AI

Abstract

Temporal quality is a critical aspect of video generation, as it ensures consistent motion and realistic dynamics across frames. However, achieving high temporal coherence and diversity remains challenging. In this work, we explore temporal augmentation in video generation for the first time and, as an initial investigation, introduce FluxFlow, a strategy designed to enhance temporal quality. Operating at the data level, FluxFlow applies controlled temporal perturbations without requiring architectural modifications. Extensive experiments on the UCF-101 and VBench benchmarks demonstrate that FluxFlow significantly improves temporal coherence and diversity across various video generation models, including U-Net-, DiT-, and AR-based architectures, while preserving spatial fidelity. These findings highlight the potential of temporal augmentation as a simple yet effective approach to advancing video generation quality.
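To make the idea of a data-level temporal perturbation concrete, the sketch below shows one plausible form such an augmentation could take: locally swapping nearby frames in a training clip before it reaches the model. This is a minimal illustration only, not the paper's FluxFlow implementation; the function and parameter names (perturb_frame_order, num_swaps, max_offset) are hypothetical.

    # Illustrative sketch of a data-level temporal perturbation (hypothetical,
    # not the paper's FluxFlow code): swap a few nearby frames in a clip.
    import random

    def perturb_frame_order(frames, num_swaps=2, max_offset=2):
        """Return a copy of `frames` with a few nearby frames swapped.

        frames: frames of one clip, in temporal order (e.g., a list of arrays).
        num_swaps: number of local swaps; larger values perturb the clip more.
        max_offset: maximum temporal distance between two swapped frames.
        """
        frames = list(frames)  # copy the sequence; frame contents are untouched
        num_frames = len(frames)
        for _ in range(num_swaps):
            i = random.randrange(num_frames)
            j = min(num_frames - 1, max(0, i + random.randint(-max_offset, max_offset)))
            frames[i], frames[j] = frames[j], frames[i]
        return frames

    # Example: perturb a 16-frame clip, represented here by frame indices.
    clip = list(range(16))
    print(perturb_frame_order(clip, num_swaps=3, max_offset=2))

Because such a perturbation only reorders training data, it can in principle be applied to any video generator without changing the model architecture, which is the property the abstract emphasizes.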
