Image Conductor: Precision Control for Interactive Video Synthesis

Abstract

Filmmaking and animation production often require sophisticated techniques for coordinating camera transitions and object movements, typically involving labor-intensive real-world capture. Despite advances in generative AI for video creation, achieving precise control over motion for interactive video asset generation remains challenging. To this end, we propose Image Conductor, a method that generates video assets from a single image with precise control over camera transitions and object movements. We propose a carefully designed training strategy that separates camera motion from object motion by means of camera LoRA weights and object LoRA weights. To further address cinematographic variations arising from ill-posed trajectories, we introduce a camera-free guidance technique during inference, which enhances object movements while eliminating camera transitions. Additionally, we develop a trajectory-oriented video motion data curation pipeline for training. Quantitative and qualitative experiments demonstrate our method's precision and fine-grained control in generating motion-controllable videos from images, advancing the practical application of interactive video synthesis. The project webpage is available at https://liyaowei-stu.github.io/project/ImageConductor/
