Image Conductor: Precision Control for Interactive Video Synthesis
June 21, 2024
Authors: Yaowei Li, Xintao Wang, Zhaoyang Zhang, Zhouxia Wang, Ziyang Yuan, Liangbin Xie, Yuexian Zou, Ying Shan
cs.AI
Abstract
Filmmaking and animation production often require sophisticated techniques
for coordinating camera transitions and object movements, typically involving
labor-intensive real-world capturing. Despite advancements in generative AI for
video creation, achieving precise control over motion for interactive video
asset generation remains challenging. To this end, we propose Image Conductor,
a method for precise control of camera transitions and object movements to
generate video assets from a single image. A well-cultivated training strategy
is proposed to separate distinct camera and object motions via camera LoRA
weights and object LoRA weights. To further address cinematographic variations
from ill-posed trajectories, we introduce a camera-free guidance technique
during inference, enhancing object movements while eliminating camera
transitions. Additionally, we develop a trajectory-oriented video motion data
curation pipeline for training. Quantitative and qualitative experiments
demonstrate our method's precision and fine-grained control in generating
motion-controllable videos from images, advancing the practical application of
interactive video synthesis. The project webpage is available at
https://liyaowei-stu.github.io/project/ImageConductor/
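
The abstract highlights two mechanisms: separate camera and object LoRA weights, and a camera-free guidance applied at inference to enhance object movement while suppressing camera transitions. The sketch below is a hypothetical illustration, not the authors' implementation: it assumes a classifier-free-guidance-style combination in which the prediction made with the camera LoRA is subtracted from the prediction made with the object LoRA. All function names, the `unet(latents, t, cond=..., lora=...)` signature, and the guidance weights are assumptions for illustration only.

```python
# Minimal sketch (assumed, not the paper's code) of camera-free guidance
# combined with separate camera/object LoRA weights at one denoising step.

import torch

def camera_free_guidance_step(
    unet,          # base video diffusion UNet (hypothetical callable)
    latents,       # noisy video latents at the current timestep
    t,             # current diffusion timestep
    cond,          # image / trajectory conditioning
    object_lora,   # LoRA weights trained on object motion
    camera_lora,   # LoRA weights trained on camera motion
    w_cfg=7.5,     # ordinary classifier-free guidance scale (assumed value)
    w_cam=1.5,     # extra weight that suppresses camera-induced motion (assumed)
):
    """One denoising step with a CFG-style combination: the object-LoRA
    prediction is pushed away from the camera-LoRA prediction, enhancing
    object movement while cancelling camera transitions."""
    with torch.no_grad():
        # Unconditional prediction for standard classifier-free guidance.
        eps_uncond = unet(latents, t, cond=None, lora=object_lora)

        # Conditional prediction with the object-motion LoRA active.
        eps_obj = unet(latents, t, cond=cond, lora=object_lora)

        # Conditional prediction with the camera-motion LoRA active.
        eps_cam = unet(latents, t, cond=cond, lora=camera_lora)

    # Standard CFG term plus a "camera-free" term steering the sample
    # away from camera-style motion and toward object motion.
    eps = (
        eps_uncond
        + w_cfg * (eps_obj - eps_uncond)
        + w_cam * (eps_obj - eps_cam)
    )
    return eps
```

The design choice sketched here mirrors ordinary classifier-free guidance: just as the unconditional branch defines a direction to move away from, the camera-LoRA branch defines a "camera motion" direction to move away from, which is one plausible reading of how ill-posed trajectories could be prevented from producing unwanted camera transitions.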