Puppet-Master: Scaling Interactive Video Generation as a Motion Prior for Part-Level Dynamics
August 8, 2024
Authors: Ruining Li, Chuanxia Zheng, Christian Rupprecht, Andrea Vedaldi
cs.AI
Abstract
We present Puppet-Master, an interactive video generative model that can
serve as a motion prior for part-level dynamics. At test time, given a single
image and a sparse set of motion trajectories (i.e., drags), Puppet-Master can
synthesize a video depicting realistic part-level motion faithful to the given
drag interactions. This is achieved by fine-tuning a large-scale pre-trained
video diffusion model, for which we propose a new conditioning architecture to
inject the dragging control effectively. More importantly, we introduce the
all-to-first attention mechanism, a drop-in replacement for the widely adopted
spatial attention modules, which significantly improves generation quality by
addressing the appearance and background issues in existing models. Unlike
other motion-conditioned video generators that are trained on in-the-wild
videos and mostly move an entire object, Puppet-Master is learned from
Objaverse-Animation-HQ, a new dataset of curated part-level motion clips. We
propose a strategy to automatically filter out sub-optimal animations and
augment the synthetic renderings with meaningful motion trajectories.
Puppet-Master generalizes well to real images across various categories and
outperforms existing methods in a zero-shot manner on a real-world benchmark.
See our project page for more results: vgg-puppetmaster.github.io.
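The abstract names all-to-first attention as a drop-in replacement for spatial attention but does not spell out the computation. The following is a minimal sketch of one plausible reading, assuming that the queries of every frame attend only to the keys and values of the first frame. The function name `all_to_first_attention`, the tensor layout `(frames, tokens, channels)`, and the single-head formulation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def all_to_first_attention(q, k, v):
    """Single-head attention where every frame attends to the first frame.

    q, k, v: arrays of shape (frames, tokens, channels).
    Assumption: keys and values of non-first frames are simply ignored.
    """
    k0, v0 = k[0], v[0]                      # (tokens, channels)
    scale = 1.0 / np.sqrt(q.shape[-1])
    scores = (q @ k0.T) * scale              # (frames, tokens, tokens)
    weights = softmax(scores, axis=-1)
    return weights @ v0                      # (frames, tokens, channels)


# Usage: 3 frames, 4 spatial tokens, 8 channels of random features.
rng = np.random.default_rng(0)
q = rng.normal(size=(3, 4, 8))
k = rng.normal(size=(3, 4, 8))
v = rng.normal(size=(3, 4, 8))
out = all_to_first_attention(q, k, v)
```

Under this reading, the output has the same shape as a standard per-frame spatial attention output, which is what would make it usable as a drop-in replacement, while every frame draws its appearance information from the first (conditioning) frame.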