GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
February 11, 2024
Authors: Xiaoyu Zhou, Xingjian Ran, Yajiao Xiong, Jinlin He, Zhiwei Lin, Yongtao Wang, Deqing Sun, Ming-Hsuan Yang
cs.AI
Abstract
We present GALA3D, generative 3D GAussians with LAyout-guided control, for
effective compositional text-to-3D generation. We first utilize large language
models (LLMs) to generate the initial layout and introduce a layout-guided 3D
Gaussian representation for 3D content generation with adaptive geometric
constraints. We then propose an object-scene compositional optimization
mechanism with conditioned diffusion to collaboratively generate realistic 3D
scenes with consistent geometry, texture, scale, and accurate interactions
among multiple objects while simultaneously adjusting the coarse layout priors
extracted from the LLMs to align with the generated scene. Experiments show
that GALA3D is a user-friendly, end-to-end framework for state-of-the-art
scene-level 3D content generation and controllable editing while ensuring the
high fidelity of object-level entities within the scene. Source code and
models will be available at https://gala3d.github.io/.
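The abstract describes constraining per-object 3D Gaussians to a coarse layout produced by an LLM. The paper's actual representation and optimization are not given here, so the following is only a minimal illustrative sketch of the general idea: layout boxes act as geometric priors, and Gaussian centers are projected back into their assigned box. The `LayoutBox` and `constrain_to_box` names are hypothetical, not part of GALA3D's API.

```python
from dataclasses import dataclass

@dataclass
class LayoutBox:
    """Axis-aligned layout box (center, half-extents), e.g. parsed from an
    LLM-generated coarse scene layout. Values here are illustrative only."""
    center: tuple
    half_extents: tuple

def constrain_to_box(point, box):
    """Clamp a 3D Gaussian center into its layout box, a simple stand-in
    for the adaptive geometric constraints described in the abstract."""
    return tuple(
        min(max(p, c - h), c + h)
        for p, c, h in zip(point, box.center, box.half_extents)
    )

# Hypothetical "table" box from a coarse LLM layout.
table = LayoutBox(center=(0.0, 0.5, 0.0), half_extents=(1.0, 0.5, 0.6))
print(constrain_to_box((2.0, 0.2, -1.0), table))  # -> (1.0, 0.2, -0.6)
```

In the actual method, such layout priors are themselves adjusted during the object-scene compositional optimization rather than held fixed, so a real implementation would treat the box parameters as learnable alongside the Gaussians.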