
Disentangled 3D Scene Generation with Layout Learning

February 26, 2024
Authors: Dave Epstein, Ben Poole, Ben Mildenhall, Alexei A. Efros, Aleksander Holynski
cs.AI

Abstract

We introduce a method to generate 3D scenes that are disentangled into their component objects. This disentanglement is unsupervised, relying only on the knowledge of a large pretrained text-to-image model. Our key insight is that objects can be discovered by finding parts of a 3D scene that, when rearranged spatially, still produce valid configurations of the same scene. Concretely, our method jointly optimizes multiple NeRFs from scratch - each representing its own object - along with a set of layouts that composite these objects into scenes. We then encourage these composited scenes to be in-distribution according to the image generator. We show that despite its simplicity, our approach successfully generates 3D scenes decomposed into individual objects, enabling new capabilities in text-to-3D content creation. For results and an interactive demo, see our project page at https://dave.ml/layoutlearning/
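The core mechanism described above is compositing K per-object NeRFs into a scene via a learned layout of rigid transforms. Below is a minimal numpy sketch of that compositing step under stated simplifications: each "object" is a Gaussian density blob standing in for a NeRF MLP, and each layout entry is a translation only (the paper's layouts are full rigid transforms, and the real method optimizes everything with a score-distillation loss from the text-to-image model). All function names here are illustrative, not from the paper's code.

```python
import numpy as np

def make_layout(rng, n_objects):
    """A layout: one transform per object. Simplified to a 3D
    translation; the actual method also learns rotations."""
    return rng.normal(scale=1.0, size=(n_objects, 3))

def object_density(points, center, sigma=0.5):
    """Stand-in for one object's NeRF density query: an isotropic
    Gaussian blob. A real implementation evaluates a learned MLP."""
    d2 = np.sum((points - center) ** 2, axis=-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def composite_density(points, layout, centers):
    """Composite the objects into one scene: map each query point
    into every object's local frame (inverse of its placement),
    query that object's density, and sum the densities."""
    total = np.zeros(points.shape[0])
    for t, c in zip(layout, centers):
        local_points = points - t  # undo the translation placing the object
        total += object_density(local_points, c)
    return total

# Toy usage: two objects at the origin of their local frames,
# placed in the scene by a random layout.
rng = np.random.default_rng(0)
layout = make_layout(rng, 2)
centers = np.zeros((2, 3))
query = layout + centers  # one query point at each placed object's center
density = composite_density(query, layout, centers)
```

During training, the method would render this composited density from random viewpoints and backpropagate an image-distribution loss through both the object networks and the layout parameters; rearranging objects with a different sampled layout must still yield an in-distribution scene, which is what forces the decomposition to align with objects.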