MoVieS: Motion-Aware 4D Dynamic View Synthesis in One Second
July 14, 2025
Authors: Chenguo Lin, Yuchen Lin, Panwang Pan, Yifan Yu, Honglei Yan, Katerina Fragkiadaki, Yadong Mu
cs.AI
Abstract
We present MoVieS, a novel feed-forward model that synthesizes 4D dynamic
novel views from monocular videos in one second. MoVieS represents dynamic 3D
scenes using pixel-aligned grids of Gaussian primitives, explicitly supervising
their time-varying motion. This allows, for the first time, the unified
modeling of appearance, geometry and motion, and enables view synthesis,
reconstruction and 3D point tracking within a single learning-based framework.
By bridging novel view synthesis with dynamic geometry reconstruction, MoVieS
enables large-scale training on diverse datasets with minimal dependence on
task-specific supervision. As a result, it also naturally supports a wide range
of zero-shot applications, such as scene flow estimation and moving object
segmentation. Extensive experiments validate the effectiveness and efficiency
of MoVieS across multiple tasks, achieving competitive performance while
offering speedups of several orders of magnitude.
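The abstract's core idea is a pixel-aligned grid of Gaussian primitives whose centers are displaced over time by an explicitly supervised motion field. The sketch below is only an illustration of that representation under assumed names, shapes, and parameterization; it is not the authors' implementation.

```python
import torch

# Hypothetical pixel-aligned Gaussian grid for an H x W frame over T timesteps.
# Each pixel carries one Gaussian primitive (center, scale, rotation, opacity,
# color) plus a per-timestep motion offset; all names and shapes are assumptions.
H, W, T = 32, 32, 8  # toy resolution and number of timesteps

centers   = torch.zeros(H, W, 3)   # 3D position of each Gaussian (one per pixel)
scales    = torch.ones(H, W, 3)    # anisotropic scale
rotations = torch.zeros(H, W, 4)   # unit quaternion (w, x, y, z)
rotations[..., 0] = 1.0
opacities = torch.ones(H, W, 1)
colors    = torch.zeros(H, W, 3)

# Time-varying motion: a per-timestep displacement field, so the same set of
# primitives can be advected to any query time t.
motion = torch.zeros(T, H, W, 3)

def centers_at(t: int) -> torch.Tensor:
    """Gaussian centers advected to timestep t (toy version of the idea)."""
    return centers + motion[t]

# Example: the displaced centers at t=3 keep the pixel-aligned layout.
print(centers_at(3).shape)  # torch.Size([32, 32, 3])
```

Under this kind of layout, the difference of advected centers between two timesteps would directly give a scene-flow-like quantity, which is consistent with the zero-shot applications mentioned in the abstract.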