
ReVideo: Remake a Video with Motion and Content Control

May 22, 2024
作者: Chong Mou, Mingdeng Cao, Xintao Wang, Zhaoyang Zhang, Ying Shan, Jian Zhang
cs.AI

摘要

Despite significant advancements in video generation and editing using diffusion models, achieving accurate and localized video editing remains a substantial challenge. Additionally, most existing video editing methods primarily focus on altering visual content, with limited research dedicated to motion editing. In this paper, we present a novel attempt to Remake a Video (ReVideo), which stands out from existing methods by allowing precise video editing in specific areas through the specification of both content and motion. Content editing is facilitated by modifying the first frame, while trajectory-based motion control offers an intuitive user interaction experience. ReVideo addresses a new task involving the coupling and training imbalance between content and motion control. To tackle this, we develop a three-stage training strategy that progressively decouples these two aspects from coarse to fine. Furthermore, we propose a spatiotemporal adaptive fusion module to integrate content and motion control across various sampling steps and spatial locations. Extensive experiments demonstrate that ReVideo achieves promising performance on several accurate video editing applications, i.e., (1) locally changing video content while keeping the motion constant, (2) keeping content unchanged and customizing new motion trajectories, and (3) modifying both content and motion trajectories. Our method can also seamlessly extend these applications to multi-area editing without specific training, demonstrating its flexibility and robustness.
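The abstract does not specify how the spatiotemporal adaptive fusion module weighs the two control signals, only that it adapts "across various sampling steps and spatial locations." A minimal NumPy sketch of one plausible reading: a gate that depends on a learned per-position logit and on the current diffusion sampling step, blending content-control and motion-control feature maps. All names, shapes, and the timestep schedule here are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def adaptive_fusion(content_feat, motion_feat, t, num_steps, w_spatial):
    """Blend content- and motion-control feature maps with a gate that
    varies over spatial positions and sampling steps (hypothetical).

    content_feat, motion_feat: (H, W, C) control feature maps
    w_spatial: (H, W, 1) learned per-position gate logits
    t: current sampling step, 0 .. num_steps - 1
    """
    # Assumed schedule: emphasize motion control at early (noisy)
    # steps and content control at late (refinement) steps.
    temporal_logit = 4.0 * (t / (num_steps - 1)) - 2.0
    gate = sigmoid(w_spatial + temporal_logit)  # (H, W, 1), in (0, 1)
    # Convex per-element combination of the two control signals.
    return gate * content_feat + (1.0 - gate) * motion_feat
```

Because the gate is a per-element convex weight, the fused features always lie between the two inputs, and the same module covers all three editing modes in the abstract: a near-zero gate in a region defers to motion control there, a near-one gate defers to content control.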
