GenRecon: Bridging Generative Priors for Multi-View 3D Scene Reconstruction
May 22, 2026
Authors: Katharina Schmid, Nicolas von Lützow, Jozef Hladký, Angela Dai, Matthias Nießner
cs.AI
Abstract
We introduce a new approach to high-fidelity 3D scene reconstruction from multi-view RGB images that tightly couples reconstruction with a strong generative 3D prior. We cast scene reconstruction as conditional 3D generation over a set of spatially localized, overlapping chunks that together tile the scene, which scales generation to large scene extents. Crucially, we inherit the fidelity and completeness of state-of-the-art generative shape models (we use Trellis.2 as an example) and generalize them to the scene level. To this end, we propose a projection-based conditioning mechanism that lifts posed multi-view image features into a coherent 3D representation that is aligned with the generative model, independent of view ordering, and spatially anchored to the scene, yielding high-fidelity, multi-view-consistent generated geometry. This lifts the strong object-level prior of Trellis.2 to multi-view, scene-scale generation and produces faithful, editable PBR mesh reconstructions of indoor environments. As a result, we obtain high-fidelity reconstructions that outperform state-of-the-art reconstruction methods by 16%.
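To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) one way to tile a scene with overlapping chunks and (b) one way to lift posed image features to a 3D point so that the result does not depend on view order. Everything here is an assumption made for illustration: the function names (`chunk_origins`, `scene_chunks`, `project`, `lift_features`), the pinhole camera model, nearest-pixel sampling, and mean pooling across views. The abstract does not say how the method places its chunks or aggregates features.

```python
from itertools import product


def chunk_origins(extent, chunk, overlap):
    """Start offsets of overlapping chunks that cover [0, extent] on one axis."""
    if not 0 <= overlap < chunk:
        raise ValueError("need 0 <= overlap < chunk")
    if extent <= chunk:
        return [0]
    stride = chunk - overlap
    starts, s = [], 0
    while s + chunk < extent:
        starts.append(s)
        s += stride
    starts.append(extent - chunk)  # final chunk sits flush with the far boundary
    return starts


def scene_chunks(extents, chunk, overlap):
    """3D chunk origins: Cartesian product of the per-axis start offsets."""
    return list(product(*(chunk_origins(e, chunk, overlap) for e in extents)))


def project(point, view):
    """Pinhole projection of a world point; (u, v), or None if behind the camera."""
    R, t = view["R"], view["t"]  # assumed world-to-camera rotation and translation
    xc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    if xc[2] <= 1e-6:
        return None
    return (view["f"] * xc[0] / xc[2] + view["cx"],
            view["f"] * xc[1] / xc[2] + view["cy"])


def lift_features(point, views):
    """Average the nearest-pixel feature over every view that sees the point."""
    total, count = None, 0
    for view in views:
        uv = project(point, view)
        if uv is None:
            continue
        col, row = round(uv[0]), round(uv[1])
        fmap = view["features"]  # H x W x C nested lists (hypothetical layout)
        if 0 <= row < len(fmap) and 0 <= col < len(fmap[0]):
            feat = fmap[row][col]
            total = list(feat) if total is None else [a + b for a, b in zip(total, feat)]
            count += 1
    return None if count == 0 else [x / count for x in total]
```

Because the per-view features are averaged, permuting `views` changes the result only by floating-point rounding, which is one simple way to obtain independence from view ordering. Sampling features at world-space points, rather than in any single camera's frame, is what anchors the representation to the scene in this sketch.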