Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior
April 10, 2024
Authors: Fan Lu, Kwan-Yee Lin, Yan Xu, Hongsheng Li, Guang Chen, Changjun Jiang
cs.AI
Abstract
Text-to-3D generation has achieved remarkable success via large-scale
text-to-image diffusion models. Nevertheless, there is no paradigm for scaling
up the methodology to urban scale. Urban scenes, characterized by numerous
elements, intricate arrangement relationships, and vast scale, present a
formidable barrier to the interpretability of ambiguous textual descriptions
for effective model optimization. In this work, we surmount these limitations by
introducing a compositional 3D layout representation into the text-to-3D paradigm,
serving as an additional prior. It comprises a set of semantic primitives with
simple geometric structures and explicit arrangement relationships,
complementing textual descriptions and enabling steerable generation. Building on
this, we propose two modifications: (1) We introduce Layout-Guided
Variational Score Distillation to address model optimization inadequacies. It
conditions the score distillation sampling process with geometric and semantic
constraints of 3D layouts. (2) To handle the unbounded nature of urban scenes,
we represent the 3D scene with a Scalable Hash Grid structure, incrementally
adapting to the growing scale of urban scenes. Extensive experiments
substantiate the capability of our framework to scale text-to-3D generation to
large-scale urban scenes that cover over 1000m driving distance for the first
time. We also present various scene editing demonstrations, showing the power
of steerable urban scene generation. Website: https://urbanarchitect.github.io.