ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors
December 20, 2023
Authors: Weijia Mao, Yan-Pei Cao, Jia-Wei Liu, Zhongcong Xu, Mike Zheng Shou
cs.AI
Abstract
We introduce ShowRoom3D, a three-stage approach for generating high-quality
3D room-scale scenes from texts. Previous methods using 2D diffusion priors to
optimize neural radiance fields for generating room-scale scenes have shown
unsatisfactory quality. This is primarily attributed to 2D priors' lack of 3D
awareness and to constraints in the training methodology. In
this paper, we utilize a 3D diffusion prior, MVDiffusion, to optimize the 3D
room-scale scene. Our contributions are in two aspects. Firstly, we propose a
progressive view selection process to optimize NeRF. This involves dividing the
training process into three stages, gradually expanding the camera sampling
scope. Secondly, we propose the pose transformation method in the second stage.
It will ensure MVDiffusion provide the accurate view guidance. As a result,
ShowRoom3D enables the generation of rooms with improved structural integrity,
enhanced clarity from any view, reduced content repetition, and higher
consistency across different perspectives. Extensive experiments demonstrate
that our method, significantly outperforms state-of-the-art approaches by a
large margin in terms of user study.
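The progressive view selection described above — splitting NeRF optimization into three stages and gradually widening the camera sampling scope — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the stage boundaries, the choice of yaw range as the "scope", and all numeric values are assumptions for the example.

```python
import random

# Hypothetical stage schedule: steps at which stages 2 and 3 begin,
# and the yaw range (degrees) sampled around a reference view in each
# stage. All values are illustrative, not taken from the paper.
STAGE_BOUNDARIES = (2000, 6000)
STAGE_YAW_RANGES = (30.0, 120.0, 360.0)

def current_stage(step):
    """Return the 0-based training stage for a given optimization step."""
    if step < STAGE_BOUNDARIES[0]:
        return 0
    if step < STAGE_BOUNDARIES[1]:
        return 1
    return 2

def sample_camera_yaw(step, rng=random):
    """Sample a camera yaw (degrees) from the current stage's scope.

    The sampling interval widens as training advances, so early stages
    see a narrow cone of views and the final stage covers the full room.
    """
    half = STAGE_YAW_RANGES[current_stage(step)] / 2.0
    return rng.uniform(-half, half)
```

In this sketch, early steps only sample cameras near a canonical view, which stabilizes the radiance field before the full 360° scope is opened up in the last stage.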