OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

March 31, 2026
Authors: Yuheng Liu, Xin Lin, Xinke Li, Baihan Yang, Chen Wang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Hao Tan, Kai Zhang, Xiaohui Xie, Zifan Shi, Yiwei Hu
cs.AI

Abstract

Modeling scenes using video generation models has garnered growing research interest in recent years. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, which limits completeness and global consistency. We propose OmniRoam, a controllable panoramic video generation framework that exploits the rich per-frame scene coverage and inherent long-term spatial and temporal consistency of panoramic representations, enabling long-horizon scene wandering. Our framework begins with a preview stage, where a trajectory-controlled video generation model creates a quick overview of the scene from a given input image or video. Then, in the refine stage, this video is temporally extended and spatially upsampled to produce long-range, high-resolution videos, thus enabling high-fidelity world wandering. To train our model, we introduce two panoramic video datasets that incorporate both synthetic and real-world captured videos. Experiments show that our framework consistently outperforms state-of-the-art methods in terms of visual quality, controllability, and long-term scene consistency, both qualitatively and quantitatively. We further showcase several extensions of this framework, including real-time video generation and 3D reconstruction. Code is available at https://github.com/yuhengliu02/OmniRoam.
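
To illustrate what "rich per-frame scene coverage" means in practice, the following is a minimal sketch of how a panoramic frame can assign a pixel to every viewing direction. It assumes an equirectangular layout and a y-up, z-forward axis convention; the abstract does not state which projection OmniRoam uses, and the function name `direction_to_equirect` is hypothetical rather than taken from the released code.

```python
import math


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction to continuous pixel coordinates (u, v)
    in an equirectangular panorama.

    Assumed convention (not taken from the paper): +z is forward,
    +x is right, +y is up; the image centre looks along +z.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("direction must be a non-zero vector")

    # Longitude in [-pi, pi]: rotation around the vertical axis.
    lon = math.atan2(x, z)
    # Latitude in [-pi/2, pi/2]: elevation above the horizon.
    # Clamp guards against rounding slightly outside [-1, 1].
    lat = math.asin(max(-1.0, min(1.0, y / norm)))

    # Longitude spans the full image width; wrap so that the
    # backward direction lands on column 0 instead of `width`.
    u = ((lon / (2.0 * math.pi) + 0.5) * width) % width
    # Latitude spans the full image height, with "up" at row 0.
    v = (0.5 - lat / math.pi) * height
    return u, v


if __name__ == "__main__":
    # Forward, right, and up all land inside the same 2048x1024 frame.
    for name, d in [("forward", (0, 0, 1)), ("right", (1, 0, 0)), ("up", (0, 1, 0))]:
        print(name, direction_to_equirect(*d, width=2048, height=1024))
```

Under these assumptions every direction on the sphere has a pixel in each frame, whereas a perspective frame only covers the directions inside its field of view.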