GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting

February 11, 2024
Authors: Xiaoyu Zhou, Xingjian Ran, Yajiao Xiong, Jinlin He, Zhiwei Lin, Yongtao Wang, Deqing Sun, Ming-Hsuan Yang
cs.AI

Abstract

We present GALA3D, generative 3D GAussians with LAyout-guided control, for effective compositional text-to-3D generation. We first use large language models (LLMs) to generate an initial layout and introduce a layout-guided 3D Gaussian representation for 3D content generation with adaptive geometric constraints. We then propose an object-scene compositional optimization mechanism with conditioned diffusion that collaboratively generates realistic 3D scenes with consistent geometry, texture, and scale and accurate interactions among multiple objects, while simultaneously adjusting the coarse layout priors extracted from the LLMs to align with the generated scene. Experiments show that GALA3D is a user-friendly, end-to-end framework for state-of-the-art scene-level 3D content generation and controllable editing that preserves the high fidelity of object-level entities within the scene. Source code and models will be available at https://gala3d.github.io/.
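
To make the idea of a layout acting as a geometric constraint more concrete, here is a minimal Python sketch. It is not the GALA3D implementation. The abstract does not specify the layout format or the form of the adaptive constraints, so the `LayoutBox` class, the axis-aligned box representation, the example values, and the simple clamping rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]


@dataclass
class LayoutBox:
    """Hypothetical axis-aligned box for one object in a coarse scene layout."""
    name: str
    center: Point
    size: Point  # full extent along x, y, z

    def contains(self, point: Point) -> bool:
        # A point is inside when it lies within half the extent on every axis.
        return all(abs(p - c) <= s / 2
                   for p, c, s in zip(point, self.center, self.size))

    def clamp(self, point: Point) -> Point:
        # Project the point onto the box, axis by axis.
        return tuple(min(max(p, c - s / 2), c + s / 2)
                     for p, c, s in zip(point, self.center, self.size))


def constrain_centers(centers: List[Point], box: LayoutBox) -> List[Point]:
    """Pull every Gaussian center back inside its object's layout box."""
    return [box.clamp(c) for c in centers]


# Made-up layout of the kind an LLM might return for "a lamp on a table".
layout = [
    LayoutBox("table", (0.0, 0.0, 0.4), (1.2, 0.8, 0.8)),
    LayoutBox("lamp", (0.0, 0.0, 1.0), (0.3, 0.3, 0.4)),
]

lamp = layout[1]
# One center already inside the lamp box, one that drifted outside along x.
centers = [(0.0, 0.0, 1.0), (0.5, 0.0, 1.0)]
constrained = constrain_centers(centers, lamp)
```

A hard clamp like this is only the simplest possible stand-in. The method described above also adjusts the layout itself during optimization, which this sketch does not attempt.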