Motion 3-to-4: 3D Motion Reconstruction for 4D Synthesis

January 20, 2026
Authors: Hongyuan Chen, Xingyu Chen, Youjia Zhang, Zexiang Xu, Anpei Chen
cs.AI

Abstract

We present Motion 3-to-4, a feed-forward framework for synthesising high-quality 4D dynamic objects from a single monocular video and an optional 3D reference mesh. While recent advances have significantly improved 2D, video, and 3D content generation, 4D synthesis remains difficult due to limited training data and the inherent ambiguity of recovering geometry and motion from a monocular viewpoint. Motion 3-to-4 addresses these challenges by decomposing 4D synthesis into static 3D shape generation and motion reconstruction. Using a canonical reference mesh, our model learns a compact motion latent representation and predicts per-frame vertex trajectories to recover complete, temporally coherent geometry. A scalable frame-wise transformer further enables robustness to varying sequence lengths. Evaluations on both standard benchmarks and a new dataset with accurate ground-truth geometry show that Motion 3-to-4 delivers superior fidelity and spatial consistency compared to prior work. Project page is available at https://motion3-to-4.github.io/.
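As a rough illustration of the decomposition the abstract describes (a canonical 3D shape held fixed while motion is recovered as per-frame vertex trajectories decoded from a compact motion latent by a frame-wise transformer), here is a minimal PyTorch sketch. All module names, dimensions, and the frame-wise attention layout are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch only: a frame-wise transformer maps a canonical mesh
# plus one motion latent per frame to per-frame vertex offsets. Because
# frames are processed independently (each frame is a batch element),
# the sequence length T can vary at inference time.
import torch
import torch.nn as nn


class MotionDecoder(nn.Module):
    """Predict per-frame vertex trajectories for a fixed canonical mesh."""

    def __init__(self, latent_dim=256, d_model=256, n_heads=8, n_layers=4):
        super().__init__()
        self.vertex_embed = nn.Linear(3, d_model)   # embed canonical xyz
        self.latent_proj = nn.Linear(latent_dim, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.transformer = nn.TransformerEncoder(layer, n_layers)
        self.offset_head = nn.Linear(d_model, 3)    # per-vertex displacement

    def forward(self, canon_verts, motion_latents):
        # canon_verts:    (V, 3)          canonical mesh vertices
        # motion_latents: (T, latent_dim) one compact latent per frame
        T, V = motion_latents.shape[0], canon_verts.shape[0]
        tokens = self.vertex_embed(canon_verts).unsqueeze(0).expand(T, V, -1)
        tokens = tokens + self.latent_proj(motion_latents).unsqueeze(1)
        feats = self.transformer(tokens)            # attention over vertices,
                                                    # frames stay independent
        offsets = self.offset_head(feats)           # (T, V, 3)
        return canon_verts.unsqueeze(0) + offsets   # per-frame vertex positions


verts = torch.rand(1000, 3)       # toy canonical mesh
latents = torch.randn(24, 256)    # 24-frame motion latent sequence
trajectories = MotionDecoder()(verts, latents)
print(trajectories.shape)         # torch.Size([24, 1000, 3])
```

Decoding trajectories against a fixed canonical mesh, rather than regenerating geometry per frame, is what makes the recovered sequence temporally coherent by construction: vertex identity is preserved across frames, and only displacements are predicted.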