Image Conductor: Precision Control for Interactive Video Synthesis

June 21, 2024
Authors: Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, Ying Shan
cs.AI

Abstract

Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capture. Despite advancements in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method for precise control of camera transitions and object movements to generate video assets from a single image. A well-cultivated training strategy is proposed to separate distinct camera and object motion using camera LoRA weights and object LoRA weights. To further address cinematographic variations from ill-posed trajectories, we introduce a camera-free guidance technique during inference, enhancing object movements while eliminating camera transitions. Additionally, we develop a trajectory-oriented video motion data curation pipeline for training. Quantitative and qualitative experiments demonstrate our method's precision and fine-grained control in generating motion-controllable videos from images, advancing the practical application of interactive video synthesis. The project webpage is available at https://liyaowei-stu.github.io/project/ImageConductor/
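
The abstract names camera-free guidance but does not give its formula. One plausible reading, by analogy with classifier-free guidance, is an extrapolation at each denoising step from the noise prediction made with the camera LoRA weights toward the prediction made with the object LoRA weights. The sketch below illustrates only that reading; the function name, the parameter names, and the combination rule itself are assumptions for illustration and are not taken from the paper.

```python
from typing import List


def camera_free_guidance(
    eps_camera: List[float],
    eps_object: List[float],
    scale: float,
) -> List[float]:
    """Hypothetical guidance rule, assumed by analogy with classifier-free guidance.

    eps_camera: noise prediction obtained with the camera LoRA weights (assumed input).
    eps_object: noise prediction obtained with the object LoRA weights (assumed input).
    scale:      guidance strength; values above 1 push further away from the
                camera-motion prediction and toward the object-motion prediction.
    """
    if len(eps_camera) != len(eps_object):
        raise ValueError("both predictions must have the same length")
    # Move from the camera prediction along the (object - camera) direction.
    return [c + scale * (o - c) for c, o in zip(eps_camera, eps_object)]


if __name__ == "__main__":
    # Toy two-element "predictions" standing in for flattened latent tensors.
    guided = camera_free_guidance([0.0, 1.0], [1.0, 1.0], scale=2.0)
    print(guided)
```

Under this assumed rule, a scale of 1 reproduces the object-LoRA prediction, and larger scales suppress whatever the two predictions disagree on in favor of object motion. In a real pipeline the two inputs would be tensors produced by the same denoising network with different LoRA weights loaded.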

