PanoWorld: A Generative Spatial World Model for Consistent Whole-House Panorama Synthesis
May 19, 2026
Authors: Jinrang Jia, Zhenjia Li, Yijiang Hu, Yifeng Shi
cs.AI
Abstract
Generating a consistent whole-house VR tour from a floorplan and style reference requires both photorealistic panoramas and cross-view spatial coherence. Pure 2D generators produce appealing single panoramas but re-imagine geometry and materials when the viewpoint changes, whereas monolithic 3D generation becomes expensive and loses fine texture at multi-room scale. We introduce PanoWorld, a generative spatial world model that treats whole-house synthesis as autoregressive generation of node-based 360-degree panoramas, matching the discrete navigation used by real VR tour products. PanoWorld uses a floorplan-derived 3D shell as a global geometric proxy and a dynamic 3D Gaussian Splatting cache as renderable spatial memory. A feed-forward panoramic LRM designed for metric-scale multi-room 360-degree inputs lifts generated panoramas into local 3DGS updates, while Room-aware Group Attention suppresses cross-room feature interference. A topology-aware progressive caching strategy fuses these local updates without repeatedly reconstructing the full history. By decoupling shell-based geometry guidance from cache-rendered visual memory, PanoWorld preserves high-frequency 2D synthesis quality while improving cross-node layout and material consistency. The project link is https://jjrcn.github.io/PanoWorld-project-home/
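The control flow described in the abstract (generate one node panorama at a time, condition it on what the cache already holds, then fuse only the new local update) can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the PanoWorld code: `SplatCache`, `generation_order`, `synthesize_tour`, and `room_group_mask` are illustrative stand-ins, the breadth-first traversal is an assumed ordering (the abstract does not specify one), and the hard same-room mask is only one possible reading of how Room-aware Group Attention might limit cross-room interference. The `generate` and `lift` callables stand in for the panorama generator and the feed-forward panoramic LRM.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SplatCache:
    """Hypothetical stand-in for the dynamic 3DGS cache: one local update per node."""
    updates: dict = field(default_factory=dict)

    def render(self, node):
        # Placeholder for rendering the cache at `node`. Here it only reports
        # which earlier nodes have contributed content so far.
        return sorted(self.updates)

    def fuse(self, node, local_update):
        # Progressive update: add the new local result, leave history untouched.
        self.updates[node] = local_update


def generation_order(adjacency, start):
    """Breadth-first order over tour nodes (an assumed traversal), so every
    node after the first is adjacent to an already generated node."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order


def synthesize_tour(adjacency, start, generate, lift):
    """Autoregressive loop: render memory, generate a panorama, lift it, fuse it."""
    cache = SplatCache()
    panoramas = {}
    for node in generation_order(adjacency, start):
        memory = cache.render(node)      # visual memory from earlier nodes
        pano = generate(node, memory)    # stand-in for the 2D panorama generator
        panoramas[node] = pano
        cache.fuse(node, lift(pano))     # stand-in for the panoramic LRM output
    return panoramas, cache


def room_group_mask(room_ids):
    """mask[i][j] is True when tokens i and j carry the same room id
    (an assumed hard-mask reading of room-aware grouping)."""
    return [[a == b for b in room_ids] for a in room_ids]


# Toy tour graph: a hall connected to a living room and a bedroom.
demo_adjacency = {"hall": ["living", "bedroom"], "living": ["hall"], "bedroom": ["hall"]}
demo_panos, demo_cache = synthesize_tour(
    demo_adjacency,
    "hall",
    generate=lambda node, memory: f"pano({node}|seen={memory})",
    lift=lambda pano: f"splat[{pano}]",
)
```

In this toy run each later node is generated with access to every earlier node's fused update, and the cache grows by exactly one entry per node rather than being rebuilt from the full history.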