ATI: Any Trajectory Instruction for Controllable Video Generation
May 28, 2025
Authors: Angtian Wang, Haibin Huang, Jacob Zhiyuan Fang, Yiding Yang, Chongyang Ma
cs.AI
Abstract
We propose a unified framework for motion control in video generation that
seamlessly integrates camera movement, object-level translation, and
fine-grained local motion using trajectory-based inputs. In contrast to prior
methods that address these motion types through separate modules or
task-specific designs, our approach offers a cohesive solution by projecting
user-defined trajectories into the latent space of pre-trained image-to-video
generation models via a lightweight motion injector. Users can specify
keypoints and their motion paths to control localized deformations, entire
object motion, virtual camera dynamics, or combinations of these. The injected
trajectory signals guide the generative process to produce temporally
consistent and semantically aligned motion sequences. Our framework
demonstrates superior performance across multiple video motion control tasks,
including stylized motion effects (e.g., motion brushes), dynamic viewpoint
changes, and precise local motion manipulation. Experiments show that our
method provides significantly better controllability and visual quality
compared to prior approaches and commercial solutions, while remaining broadly
compatible with various state-of-the-art video generation backbones. Project
page: https://anytraj.github.io/.
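The abstract gives no implementation details, so the following is only a minimal sketch of the trajectory-injection idea it describes: user-specified keypoint paths are projected by a small learned module into the latent space of a frozen image-to-video backbone and added to the video latents as a conditioning signal. The class name `MotionInjector`, the MLP design, the tensor shapes, and the normalized-coordinate convention are all assumptions for illustration, not the paper's actual API.

```python
import torch
import torch.nn as nn


class MotionInjector(nn.Module):
    """Hypothetical lightweight motion injector (illustrative sketch only).

    Projects per-keypoint (x, y) trajectories over T frames into the
    latent dimension of a frozen image-to-video backbone and adds the
    resulting features to the video latents at each keypoint location.
    """

    def __init__(self, latent_dim: int, hidden_dim: int = 128):
        super().__init__()
        # Small MLP: a 2-D coordinate at each frame -> latent_dim feature.
        self.proj = nn.Sequential(
            nn.Linear(2, hidden_dim),
            nn.SiLU(),
            nn.Linear(hidden_dim, latent_dim),
        )

    def forward(self, latents: torch.Tensor, trajs: torch.Tensor) -> torch.Tensor:
        # latents: (B, T, C, H, W) video latents from the backbone.
        # trajs:   (B, K, T, 2) normalized (x, y) paths in [0, 1] for K keypoints.
        B, K, T, _ = trajs.shape
        feats = self.proj(trajs)  # (B, K, T, C)
        latents = latents.clone()  # avoid mutating the caller's tensor
        for b in range(B):
            for k in range(K):
                for t in range(T):
                    x, y = trajs[b, k, t]
                    # Map normalized coordinates to latent grid cells.
                    h = int(y.item() * (latents.shape[3] - 1))
                    w = int(x.item() * (latents.shape[4] - 1))
                    # Inject the trajectory feature at the keypoint location.
                    latents[b, t, :, h, w] += feats[b, k, t]
        return latents


# Usage sketch: two keypoint trajectories conditioning a 16-frame latent video.
injector = MotionInjector(latent_dim=4)
latents = torch.randn(1, 16, 4, 32, 32)  # B=1, T=16, C=4, 32x32 latent grid
trajs = torch.rand(1, 2, 16, 2)          # K=2 keypoints, normalized (x, y)
conditioned = injector(latents, trajs)   # same shape, trajectory-conditioned
```

In practice such a module would be trained alongside (or inserted into) the generation backbone so the diffusion process learns to follow the injected signal; the grid-cell addition above is merely the simplest way to picture how sparse trajectory inputs can reach a dense latent representation.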