VideoSwap: Customized Video Subject Swapping with Interactive Semantic Point Correspondence
December 4, 2023
Authors: Yuchao Gu, Yipin Zhou, Bichen Wu, Licheng Yu, Jia-Wei Liu, Rui Zhao, Jay Zhangjie Wu, David Junhao Zhang, Mike Zheng Shou, Kevin Tang
cs.AI
Abstract
Current diffusion-based video editing primarily focuses on
structure-preserved editing by utilizing various dense correspondences to
ensure temporal consistency and motion alignment. However, these approaches are
often ineffective when the target edit involves a shape change. To embark on
video editing with shape change, we explore customized video subject swapping
in this work, where we aim to replace the main subject in a source video with a
target subject having a distinct identity and potentially different shape. In
contrast to previous methods that rely on dense correspondences, we introduce
the VideoSwap framework that exploits semantic point correspondences, inspired
by our observation that only a small number of semantic points are necessary to
align the subject's motion trajectory and modify its shape. We also introduce
various user-point interactions (e.g., removing points and dragging points) to
handle diverse semantic point correspondences. Extensive experiments
demonstrate state-of-the-art video subject swapping results across a variety of
real-world videos.
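The abstract's core idea is that a handful of semantic points, tracked over the source video, suffice to carry the subject's motion, and that user interactions (removing a point the target subject lacks, dragging a point to reshape it) edit this sparse correspondence before it guides generation. The paper does not specify an implementation, so the sketch below is a purely illustrative toy: `edit_point_trajectories` is a hypothetical helper, and the point names and coordinates are made up.

```python
# Toy sketch of editing sparse semantic-point trajectories, as described
# in the abstract. Hypothetical helper; not the authors' implementation.

def edit_point_trajectories(trajs, remove=(), drag=None):
    """trajs: dict mapping point name -> list of (x, y) per frame.
    remove: names of points to delete, e.g. when the target subject
            has no corresponding part (shape change via removal).
    drag:   dict mapping point name -> (dx, dy) offset applied to every
            frame, shifting the point to reshape the subject while
            preserving its motion trajectory.
    """
    edited = {name: [tuple(p) for p in traj]
              for name, traj in trajs.items() if name not in set(remove)}
    for name, (dx, dy) in (drag or {}).items():
        edited[name] = [(x + dx, y + dy) for x, y in edited[name]]
    return edited

# Example: three semantic points on a source subject over four frames,
# each moving rightward. Swapping in a target subject with no tail and
# taller ears: remove the tail point, drag the ear point upward.
trajs = {
    "nose": [(10, 20), (20, 20), (30, 20), (40, 20)],
    "tail": [(50, 25), (60, 25), (70, 25), (80, 25)],
    "ear":  [(12, 5),  (22, 5),  (32, 5),  (42, 5)],
}
edited = edit_point_trajectories(trajs, remove=["tail"], drag={"ear": (0, -3)})
print(sorted(edited))      # ['ear', 'nose']
print(edited["ear"][0])    # (12, 2)
```

Note how the surviving trajectories keep the original per-frame motion (the x-coordinates still advance by 10 each frame), which is what lets a sparse point set align the target subject's motion with the source while its shape differs.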