
MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

March 20, 2025
Authors: Quanhao Li, Zhen Xing, Rui Wang, Hui Zhang, Qi Dai, Zuxuan Wu
cs.AI

Abstract

Recent advances in video generation have led to remarkable improvements in visual quality and temporal coherence. Building on this progress, trajectory-controllable video generation has emerged to enable precise object motion control through explicitly defined spatial paths. However, existing methods struggle with complex object movements and multi-object motion control, resulting in imprecise trajectory adherence, poor object consistency, and compromised visual quality. Furthermore, these methods support trajectory control in only a single format, limiting their applicability in diverse scenarios. Additionally, there is no publicly available dataset or benchmark specifically tailored for trajectory-controllable video generation, hindering robust training and systematic evaluation. To address these challenges, we introduce MagicMotion, a novel image-to-video generation framework that enables trajectory control through three levels of conditions from dense to sparse: masks, bounding boxes, and sparse boxes. Given an input image and trajectories, MagicMotion seamlessly animates objects along defined trajectories while maintaining object consistency and visual quality. Furthermore, we present MagicData, a large-scale trajectory-controlled video dataset, along with an automated pipeline for annotation and filtering. We also introduce MagicBench, a comprehensive benchmark that assesses both video quality and trajectory control accuracy across different numbers of objects. Extensive experiments demonstrate that MagicMotion outperforms previous methods across various metrics. Our project page is publicly available at https://quanhaol.github.io/magicmotion-site.
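To make the "dense to sparse" ordering of conditions concrete, the following is a minimal sketch of how the three levels might relate as data. It is not the authors' pipeline: the helper names (`mask_to_box`, `sparsify`, `make_frame`) are hypothetical, and it assumes that "sparse boxes" means bounding boxes supplied on only a subset of frames, which the abstract does not spell out.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]           # one frame: rows of 0/1 pixel values
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def mask_to_box(mask: Mask) -> Optional[Box]:
    """Tightest axis-aligned box around the nonzero pixels, or None if empty."""
    ys = [y for y, row in enumerate(mask) if any(row)]
    xs = [x for row in mask for x, v in enumerate(row) if v]
    if not ys:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


def sparsify(boxes: List[Optional[Box]], stride: int) -> List[Optional[Box]]:
    """Keep every `stride`-th box plus the last one; blank out the rest.

    Assumption: a sparse condition is a per-frame box sequence with gaps.
    """
    last = len(boxes) - 1
    return [b if (i % stride == 0 or i == last) else None
            for i, b in enumerate(boxes)]


def make_frame(height: int, width: int, x0: int, y0: int, size: int) -> Mask:
    """Toy mask: a filled square with top-left corner at (x0, y0)."""
    return [[1 if (x0 <= x < x0 + size and y0 <= y < y0 + size) else 0
             for x in range(width)]
            for y in range(height)]


# Level 1 (densest): a per-frame mask of a 2x2 object moving right.
masks = [make_frame(6, 8, x0=t, y0=2, size=2) for t in range(5)]

# Level 2: one bounding box per frame, derived from the masks.
dense_boxes = [mask_to_box(m) for m in masks]

# Level 3 (sparsest): boxes on only a few frames.
sparse_boxes = sparsify(dense_boxes, stride=4)

print(dense_boxes)
print(sparse_boxes)
```

Each level discards information from the one before it: a box drops the object's shape, and a sparse sequence drops the path between the annotated frames, so a model conditioned on it has to infer more of the motion on its own.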
