GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
February 11, 2024
Authors: Xiaoyu Zhou, Xingjian Ran, Yajiao Xiong, Jinlin He, Zhiwei Lin, Yongtao Wang, Deqing Sun, Ming-Hsuan Yang
cs.AI
Abstract
We present GALA3D, generative 3D GAussians with LAyout-guided control, for
effective compositional text-to-3D generation. We first utilize large language
models (LLMs) to generate the initial layout and introduce a layout-guided 3D
Gaussian representation for 3D content generation with adaptive geometric
constraints. We then propose an object-scene compositional optimization
mechanism with conditioned diffusion to collaboratively generate realistic 3D
scenes with consistent geometry, texture, scale, and accurate interactions
among multiple objects while simultaneously adjusting the coarse layout priors
extracted from the LLMs to align with the generated scene. Experiments show
that GALA3D is a user-friendly, end-to-end framework for state-of-the-art
scene-level 3D content generation and controllable editing while ensuring the
high fidelity of object-level entities within the scene. Source code and
models will be available at https://gala3d.github.io/.
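The abstract describes constraining per-object 3D Gaussians to a coarse layout produced by an LLM. The paper's actual representation and optimization are not given here, so the following is only a minimal illustrative sketch of the general idea: layout boxes act as geometric priors, and Gaussian centers are projected back into their assigned box. The `LayoutBox` and `constrain_to_box` names are hypothetical, not part of GALA3D's API.

```python
from dataclasses import dataclass

@dataclass
class LayoutBox:
    """Axis-aligned layout box (center, half-extents), e.g. parsed from an
    LLM-generated coarse scene layout. Values here are illustrative only."""
    center: tuple
    half_extents: tuple

def constrain_to_box(point, box):
    """Clamp a 3D Gaussian center into its layout box, a simple stand-in
    for the adaptive geometric constraints described in the abstract."""
    return tuple(
        min(max(p, c - h), c + h)
        for p, c, h in zip(point, box.center, box.half_extents)
    )

# Hypothetical "table" box from a coarse LLM layout.
table = LayoutBox(center=(0.0, 0.5, 0.0), half_extents=(1.0, 0.5, 0.6))
print(constrain_to_box((2.0, 0.2, -1.0), table))  # -> (1.0, 0.2, -0.6)
```

In the actual method, such layout priors are themselves adjusted during the object-scene compositional optimization rather than held fixed, so a real implementation would treat the box parameters as learnable alongside the Gaussians.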