Semantic Score Distillation Sampling for Compositional Text-to-3D Generation
October 11, 2024
Authors: Ling Yang, Zixiang Zhang, Junlin Han, Bohan Zeng, Runjia Li, Philip Torr, Wentao Zhang
cs.AI
Abstract
Generating high-quality 3D assets from textual descriptions remains a pivotal
challenge in computer graphics and vision research. Due to the scarcity of 3D
data, state-of-the-art approaches utilize pre-trained 2D diffusion priors,
optimized through Score Distillation Sampling (SDS). Despite progress, crafting
complex 3D scenes featuring multiple objects or intricate interactions is still
difficult. To tackle this, recent methods have incorporated box or layout
guidance. However, these layout-guided compositional methods often struggle to
provide fine-grained control, as they are generally coarse and lack
expressiveness. To overcome these challenges, we introduce a novel SDS
approach, Semantic Score Distillation Sampling (SemanticSDS), designed to
effectively improve the expressiveness and accuracy of compositional text-to-3D
generation. Our approach integrates new semantic embeddings that maintain
consistency across different rendering views and clearly differentiate between
various objects and parts. These embeddings are transformed into a semantic
map, which directs a region-specific SDS process, enabling precise optimization
and compositional generation. By leveraging explicit semantic guidance, our
method unlocks the compositional capabilities of existing pre-trained diffusion
models, thereby achieving superior quality in 3D content generation,
particularly for complex objects and scenes. Experimental results demonstrate
that our SemanticSDS framework is highly effective for generating
state-of-the-art complex 3D content. Code:
https://github.com/YangLing0818/SemanticSDS-3D
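To make the idea of a semantic map directing a region-specific SDS process more concrete, here is a minimal NumPy sketch. It is not the authors' implementation (the linked repository is the reference for that). It assumes the semantic map is already available as an integer label map, and that a diffusion model has produced one noise prediction per sub-prompt; both are replaced by toy or random arrays here. The function names `region_masks`, `compose_noise`, and `sds_image_gradient` are hypothetical, and the gradient is the standard SDS form w(t) * (eps_hat - eps) taken with respect to the rendered image only.

```python
import numpy as np


def region_masks(semantic_map, num_regions):
    """One-hot masks of shape (K, H, W) from an integer label map of shape (H, W)."""
    labels = np.arange(num_regions)[:, None, None]
    return (semantic_map[None, :, :] == labels).astype(np.float32)


def compose_noise(eps_per_region, masks):
    """Blend per-region noise predictions (K, C, H, W) into a single (C, H, W) map."""
    return (eps_per_region * masks[:, None, :, :]).sum(axis=0)


def sds_image_gradient(eps_composed, noise, weight):
    """SDS-style gradient w.r.t. the rendered image: w(t) * (eps_hat - eps)."""
    return weight * (eps_composed - noise)


# Toy semantic map (assumption): left half is region 0, right half is region 1.
semantic_map = np.array([[0, 0, 1, 1],
                         [0, 0, 1, 1],
                         [0, 0, 1, 1],
                         [0, 0, 1, 1]])

rng = np.random.default_rng(0)
# Noise that would have been added to the rendered view.
noise = rng.standard_normal((3, 4, 4)).astype(np.float32)
# Stand-in for diffusion-model predictions, one per sub-prompt (assumption).
eps_per_region = rng.standard_normal((2, 3, 4, 4)).astype(np.float32)

masks = region_masks(semantic_map, num_regions=2)
eps_composed = compose_noise(eps_per_region, masks)
grad = sds_image_gradient(eps_composed, noise, weight=0.5)
print(grad.shape)  # (3, 4, 4)
```

In a real pipeline, the gradient would be backpropagated through a differentiable renderer into the 3D representation, and the label map would come from rendering the semantic embeddings for the same view, so that each pixel is optimized against the prompt for its own object or part.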