Generative View Stitching
October 28, 2025
Authors: Chonghyuk Song, Michal Stary, Boyuan Chen, George Kopanas, Vincent Sitzmann
cs.AI
Abstract
Autoregressive video diffusion models are capable of long rollouts that are
stable and consistent with history, but they are unable to guide the current
generation with conditioning from the future. In camera-guided video generation
with a predefined camera trajectory, this limitation leads to collisions with
the generated scene, after which autoregression quickly collapses. To address
this, we propose Generative View Stitching (GVS), which samples the entire
sequence in parallel such that the generated scene is faithful to every part of
the predefined camera trajectory. Our main contribution is a sampling algorithm
that extends prior work on diffusion stitching for robot planning to video
generation. While such stitching methods usually require a specially trained
model, GVS is compatible with any off-the-shelf video model trained with
Diffusion Forcing, a prevalent sequence diffusion framework that we show
already provides the affordances necessary for stitching. We then introduce
Omni Guidance, a technique that enhances the temporal consistency in stitching
by conditioning on both the past and future, and that enables our proposed
loop-closing mechanism for delivering long-range coherence. Overall, GVS
achieves camera-guided video generation that is stable, collision-free, and
frame-to-frame consistent, and that closes loops for a variety of predefined
camera paths, including Oscar Reutersvärd's Impossible Staircase. Results are
best viewed as videos at https://andrewsonga.github.io/gvs.
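
The abstract does not spell out the sampling algorithm, so the following is only a toy sketch of the general idea behind stitching, not the GVS method itself. It assumes a sequence of scalar "frames", a fixed-length denoiser that sees one window at a time, and overlapping windows whose predictions are averaged at every step. All names here (`make_windows`, `stitch_step`, `toy_denoiser`, `sample_stitched`) are hypothetical, the denoiser is a stand-in for a real video model, and neither a noise schedule, camera conditioning, Omni Guidance, nor loop closing is modeled.

```python
import random


def make_windows(num_frames, window, stride):
    """Return overlapping (start, end) index ranges that cover every frame."""
    if not 0 < stride <= window <= num_frames:
        raise ValueError("need 0 < stride <= window <= num_frames")
    starts = list(range(0, num_frames - window + 1, stride))
    if starts[-1] != num_frames - window:
        starts.append(num_frames - window)  # extra window so the tail is covered
    return [(s, s + window) for s in starts]


def stitch_step(frames, denoise_fn, windows):
    """One stitched update: denoise every window, then average the overlaps."""
    total = [0.0] * len(frames)
    count = [0] * len(frames)
    for start, end in windows:
        # Each call sees a single window only; the calls are independent of
        # one another, so a real system could batch them.
        estimate = denoise_fn(frames[start:end])
        for offset, value in enumerate(estimate):
            total[start + offset] += value
            count[start + offset] += 1
    return [t / c for t, c in zip(total, count)]


def toy_denoiser(chunk):
    """Hypothetical stand-in for a model: pull each value halfway to the chunk mean."""
    mean = sum(chunk) / len(chunk)
    return [0.5 * x + 0.5 * mean for x in chunk]


def sample_stitched(num_frames=12, window=6, stride=3, steps=200, seed=0):
    """Start from noise and repeatedly apply stitched updates to the whole sequence."""
    rng = random.Random(seed)
    frames = [rng.gauss(0.0, 1.0) for _ in range(num_frames)]
    windows = make_windows(num_frames, window, stride)
    for _ in range(steps):
        frames = stitch_step(frames, toy_denoiser, windows)
    return frames
```

In this toy setting, no single window ever spans the full sequence, yet the shared overlap frames pass information in both directions, so the first and last frames end up agreeing with each other. This is meant only to give intuition for why sampling the whole sequence jointly can respect constraints that a purely past-conditioned rollout cannot see.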