PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis
May 19, 2026
Authors: Jinrang Jia, Zhenjia Li, Yijiang Hu, Yifeng Shi
cs.AI
Abstract
Generating a consistent whole-house VR tour from a floorplan and a style reference requires both photorealistic panoramas and cross-view spatial coherence. Pure 2D generators produce appealing single panoramas but re-imagine geometry and materials when the viewpoint changes, whereas monolithic 3D generation becomes expensive and loses fine texture at multi-room scale. We introduce PanoWorld, a generative spatial world model that treats whole-house synthesis as autoregressive generation of node-based 360-degree panoramas, matching the discrete navigation used by real VR tour products. PanoWorld uses a floorplan-derived 3D shell as a global geometric proxy and a dynamic 3D Gaussian Splatting (3DGS) cache as renderable spatial memory. A feed-forward panoramic LRM designed for metric-scale, multi-room 360-degree inputs lifts generated panoramas into local 3DGS updates, while Room-aware Group Attention suppresses cross-room feature interference. A topology-aware progressive caching strategy fuses these local updates without repeatedly reconstructing the full history. By decoupling shell-based geometry guidance from cache-rendered visual memory, PanoWorld preserves high-frequency 2D synthesis quality while improving cross-node layout and material consistency. Project page: https://jjrcn.github.io/PanoWorld-project-home/
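The node-based autoregressive loop described in the abstract can be pictured with a minimal sketch. This is an illustration under stated assumptions, not the authors' implementation: the function names (`traversal_order`, `synthesize_tour`), the breadth-first visiting order, the `generate` and `lift` callables, and the plain dictionary standing in for the 3DGS cache are all hypothetical. It only shows the control flow: each node is generated from memory contributed by already-visited neighbors, and its lifted update is added to the cache without rebuilding earlier entries.

```python
from collections import deque


def traversal_order(adjacency, start):
    """Breadth-first order over tour nodes (assumed ordering, not from the paper)."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def synthesize_tour(adjacency, start, generate, lift):
    """Generate one panorama per node, conditioning on cached neighbor updates."""
    cache = {}       # node -> local update; a stand-in for the 3DGS cache
    panoramas = {}
    for node in traversal_order(adjacency, start):
        # Only neighbors that are already cached contribute visual memory.
        memory = [cache[n] for n in adjacency[node] if n in cache]
        panoramas[node] = generate(node, memory)
        # Incremental update: earlier cache entries are left untouched.
        cache[node] = lift(node, panoramas[node])
    return panoramas, cache


# Toy floorplan graph and placeholder callables (all hypothetical).
adjacency = {
    "living": ["kitchen", "hall"],
    "kitchen": ["living"],
    "hall": ["living", "bedroom"],
    "bedroom": ["hall"],
}
generate = lambda node, memory: f"pano({node}|{','.join(sorted(memory))})"
lift = lambda node, pano: f"gs:{node}"

panoramas, cache = synthesize_tour(adjacency, "living", generate, lift)
for node, pano in panoramas.items():
    print(node, "->", pano)
```

In a real system, `generate` would be the panorama generator conditioned on the shell and cache renders, and `lift` would be the feed-forward reconstruction step; here both are string placeholders so the loop can run on its own.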