MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

March 20, 2025
Authors: Quanhao Li, Zhen Xing, Rui Wang, Hui Zhang, Qi Dai, Zuxuan Wu
cs.AI

Abstract

Recent advances in video generation have led to remarkable improvements in visual quality and temporal coherence. Building on this progress, trajectory-controllable video generation has emerged to enable precise object motion control through explicitly defined spatial paths. However, existing methods struggle with complex object movements and multi-object motion control, resulting in imprecise trajectory adherence, poor object consistency, and compromised visual quality. Furthermore, these methods support trajectory control in only a single format, limiting their applicability in diverse scenarios. Additionally, there is no publicly available dataset or benchmark specifically tailored for trajectory-controllable video generation, which hinders robust training and systematic evaluation. To address these challenges, we introduce MagicMotion, a novel image-to-video generation framework that enables trajectory control through three levels of conditions from dense to sparse: masks, bounding boxes, and sparse boxes. Given an input image and trajectories, MagicMotion seamlessly animates objects along the defined trajectories while maintaining object consistency and visual quality. Furthermore, we present MagicData, a large-scale trajectory-controlled video dataset, along with an automated pipeline for annotation and filtering. We also introduce MagicBench, a comprehensive benchmark that assesses both video quality and trajectory control accuracy across different numbers of objects. Extensive experiments demonstrate that MagicMotion outperforms previous methods across various metrics. Our project page is publicly available at https://quanhaol.github.io/magicmotion-site.
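
The abstract names three condition levels (masks, bounding boxes, sparse boxes) but does not say how one is derived from another. The sketch below is only an illustration under two assumptions that are not stated in the abstract: that a bounding box is the tightest rectangle around a per-frame mask, and that "sparse boxes" means boxes kept on a few frames only. The names `mask_to_box`, `masks_to_boxes`, and `sparsify` are hypothetical and are not taken from the MagicMotion code.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]            # one frame: rows of 0/1 pixels
Box = Tuple[int, int, int, int]   # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Return the tightest axis-aligned box around nonzero pixels, or None if empty."""
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def masks_to_boxes(masks: List[Mask]) -> List[Optional[Box]]:
    """Dense condition (masks) -> medium condition (one box per frame)."""
    return [mask_to_box(m) for m in masks]


def sparsify(boxes: List[Optional[Box]], num_keep: int) -> List[Optional[Box]]:
    """Medium condition -> sparse condition (assumed definition).

    Keeps boxes on num_keep evenly spaced frames and sets the rest to None.
    Values of num_keep below 1 are treated as 1 (first frame only).
    """
    n = len(boxes)
    if num_keep >= n:
        return list(boxes)
    if num_keep <= 1:
        keep = {0}
    else:
        keep = {round(i * (n - 1) / (num_keep - 1)) for i in range(num_keep)}
    return [b if i in keep else None for i, b in enumerate(boxes)]


def make_square_mask(x0: int, height: int = 4, width: int = 8) -> Mask:
    """Toy frame: a 2x2 square whose left edge sits at column x0, on rows 1-2."""
    return [[1 if (1 <= y <= 2 and x0 <= x <= x0 + 1) else 0 for x in range(width)]
            for y in range(height)]


# Toy trajectory: the square moves one pixel to the right per frame.
masks = [make_square_mask(x0) for x0 in range(5)]
boxes = masks_to_boxes(masks)
sparse_boxes = sparsify(boxes, num_keep=3)
```

Each level carries less information than the one before it: a mask fixes the object's shape in every frame, a box fixes only its extent, and the sparse variant fixes that extent on a handful of frames and leaves the motion in between unconstrained.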
