MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance
March 20, 2025
Authors: Quanhao Li, Zhen Xing, Rui Wang, Hui Zhang, Qi Dai, Zuxuan Wu
cs.AI
Abstract
Recent advances in video generation have led to remarkable improvements in
visual quality and temporal coherence. Building on this, trajectory-controllable video
generation has emerged to enable precise object motion control through
explicitly defined spatial paths. However, existing methods struggle with
complex object movements and multi-object motion control, resulting in
imprecise trajectory adherence, poor object consistency, and compromised visual
quality. Furthermore, these methods only support trajectory control in a single
format, limiting their applicability in diverse scenarios. Additionally, there
is no publicly available dataset or benchmark specifically tailored for
trajectory-controllable video generation, hindering robust training and
systematic evaluation. To address these challenges, we introduce MagicMotion, a
novel image-to-video generation framework that enables trajectory control
through three levels of conditions from dense to sparse: masks, bounding boxes,
and sparse boxes. Given an input image and trajectories, MagicMotion seamlessly
animates objects along defined trajectories while maintaining object
consistency and visual quality. Furthermore, we present MagicData, a
large-scale trajectory-controlled video dataset, along with an automated
pipeline for annotation and filtering. We also introduce MagicBench, a
comprehensive benchmark that assesses both video quality and trajectory control
accuracy across different numbers of objects. Extensive experiments demonstrate
that MagicMotion outperforms previous methods across various metrics. Our
project page is publicly available at
https://quanhaol.github.io/magicmotion-site.