3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework
December 19, 2025
Authors: Tobias Sautter, Jan-Niklas Dihlmann, Hendrik P. A. Lensch
cs.AI
Abstract
Recent advances in 3D scene generation produce visually appealing output, but current representations hinder artists' workflows that require modifiable 3D textured mesh scenes for visual effects and game development. Despite significant advances, current textured mesh scene reconstruction methods are far from artist-ready, suffering from incorrect object decomposition, inaccurate spatial relationships, and missing backgrounds. We present 3D-RE-GEN, a compositional framework that reconstructs a single image into textured 3D objects and a background. We show that combining state-of-the-art models from specific domains achieves state-of-the-art scene reconstruction performance while addressing artists' requirements.
Our reconstruction pipeline integrates models for asset detection, reconstruction, and placement, pushing certain models beyond their originally intended domains. Recovering occluded objects is treated as an image-editing task in which generative models infer and reconstruct them using scene-level reasoning under consistent lighting and geometry. Unlike current methods, 3D-RE-GEN generates a comprehensive background that spatially constrains objects during optimization and provides a foundation for realistic lighting and simulation tasks in visual effects and games. To obtain physically realistic layouts, we employ a novel 4-DoF differentiable optimization that aligns reconstructed objects with the estimated ground plane. 3D-RE-GEN achieves state-of-the-art performance in single-image 3D scene reconstruction, producing coherent, modifiable scenes through compositional generation guided by precise camera recovery and spatial optimization.
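To make the idea of a ground-aligned 4-DoF optimization concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which four degrees of freedom are optimized or which loss is used. This sketch assumes yaw about the up axis, uniform scale, and translation within the ground plane, with the ground plane at y = 0 and the object's lowest point pinned to it. It also assumes known point correspondences and a plain squared-error loss minimized by gradient descent. The function names `transform` and `fit_4dof` are hypothetical.

```python
import numpy as np


def transform(points, yaw, scale, tx, tz):
    """Apply an assumed 4-DoF ground-aligned transform (y is up, ground at y = 0)."""
    points = np.asarray(points, dtype=float)
    c, s = np.cos(yaw), np.sin(yaw)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    out = np.empty_like(points)
    out[:, 0] = scale * (c * x + s * z) + tx   # rotate about y, scale, shift in x
    out[:, 1] = scale * (y - y.min())          # lowest point always rests on y = 0
    out[:, 2] = scale * (-s * x + c * z) + tz  # rotate about y, scale, shift in z
    return out


def fit_4dof(points, target, steps=3000, lr=0.1):
    """Fit (yaw, scale, tx, tz) by gradient descent on the mean squared error."""
    points = np.asarray(points, dtype=float)
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    h = y - y.min()                            # height above the object's base
    yaw, scale, tx, tz = 0.0, 1.0, 0.0, 0.0    # identity initialization
    for _ in range(steps):
        c, s = np.cos(yaw), np.sin(yaw)
        rx = c * x + s * z                     # rotated x before scaling
        rz = -s * x + c * z                    # rotated z before scaling
        res = transform(points, yaw, scale, tx, tz) - target
        # Analytic gradients of mean(sum(res**2, axis=1)) w.r.t. each parameter.
        g_tx = 2.0 * res[:, 0].mean()
        g_tz = 2.0 * res[:, 2].mean()
        g_scale = 2.0 * (res[:, 0] * rx + res[:, 1] * h + res[:, 2] * rz).mean()
        g_yaw = 2.0 * scale * (res[:, 0] * rz - res[:, 2] * rx).mean()
        yaw -= lr * g_yaw
        scale -= lr * g_scale
        tx -= lr * g_tx
        tz -= lr * g_tz
    return yaw, scale, tx, tz


# Synthetic usage: recover a known placement from noise-free correspondences.
rng = np.random.default_rng(0)
object_points = rng.uniform(-0.5, 0.5, size=(200, 3))
observed = transform(object_points, yaw=0.5, scale=1.5, tx=0.3, tz=-0.2)
estimate = fit_4dof(object_points, observed)
```

Because the vertical offset is derived from the object's own base rather than optimized, the object cannot float above or sink below the assumed ground plane, whatever values the other four parameters take. A real pipeline would replace the synthetic correspondences with an image- or depth-based objective.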