OmniX: From Unified Panoramic Generation and Perception to Graphics-Ready 3D Scenes
October 30, 2025
Authors: Yukun Huang, Jiwen Yu, Yanning Zhou, Jianan Wang, Xintao Wang, Pengfei Wan, Xihui Liu
cs.AI
Abstract
There are two prevalent ways to construct 3D scenes: procedural generation
and 2D lifting. Among them, panorama-based 2D lifting has emerged as a
promising technique, leveraging powerful 2D generative priors to produce
immersive, realistic, and diverse 3D environments. In this work, we advance
this technique to generate graphics-ready 3D scenes suitable for physically
based rendering (PBR), relighting, and simulation. Our key insight is to
repurpose 2D generative models for panoramic perception of geometry, textures,
and PBR materials. Unlike existing 2D lifting approaches that emphasize
appearance generation and ignore the perception of intrinsic properties, we
present OmniX, a versatile and unified framework. Based on a lightweight and
efficient cross-modal adapter structure, OmniX reuses 2D generative priors for
a broad range of panoramic vision tasks, including panoramic perception,
generation, and completion. Furthermore, we construct a large-scale synthetic
panorama dataset containing high-quality multimodal panoramas from diverse
indoor and outdoor scenes. Extensive experiments demonstrate the effectiveness
of our model in panoramic visual perception and graphics-ready 3D scene
generation, opening new possibilities for immersive and physically realistic
virtual world generation.
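
As an illustration of what panorama-based 2D lifting involves geometrically, the following is a minimal sketch, assuming an equirectangular panorama and a per-pixel depth map that stores distance along each viewing ray. The function names and the axis convention (y up, +z at the panorama centre) are hypothetical choices made for this example and are not taken from OmniX.

```python
import math


def panorama_pixel_to_ray(u, v, width, height):
    """Unit viewing direction for pixel (u, v) of an equirectangular panorama.

    Assumed convention (illustrative, not from the paper): y is up, +z points
    at the panorama centre, longitude grows to the right, latitude grows upward.
    """
    # Sample at pixel centres, hence the +0.5 offsets.
    lon = ((u + 0.5) / width) * 2.0 * math.pi - math.pi    # in [-pi, pi)
    lat = math.pi / 2.0 - ((v + 0.5) / height) * math.pi   # top row is near +pi/2
    x = math.cos(lat) * math.sin(lon)
    y = math.sin(lat)
    z = math.cos(lat) * math.cos(lon)
    return (x, y, z)


def lift_panorama_depth(depth):
    """Turn a 2D grid of per-pixel ray distances into a list of 3D points."""
    height = len(depth)
    width = len(depth[0])
    points = []
    for v in range(height):
        for u in range(width):
            dx, dy, dz = panorama_pixel_to_ray(u, v, width, height)
            d = depth[v][u]
            # Scale the unit ray by the distance to get the 3D position.
            points.append((d * dx, d * dy, d * dz))
    return points


if __name__ == "__main__":
    # A tiny 4x8 panorama whose every pixel is 2 units away: the lifted
    # points all lie on a sphere of radius 2 around the camera.
    cloud = lift_panorama_depth([[2.0] * 8 for _ in range(4)])
    print(len(cloud), "points lifted")
```

The same per-pixel correspondence would let other perceived panoramic maps (for example normals or PBR material channels) be attached to the lifted points, although how OmniX itself assembles the final scene is not specified in this abstract.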