Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation
July 8, 2023
Authors: Aditya Sanghi, Pradeep Kumar Jayaraman, Arianna Rampini, Joseph Lambourne, Hooman Shayani, Evan Atherton, Saeid Asgari Taghanaki
cs.AI
Abstract
Significant progress has recently been made in creative applications of large
pre-trained models for downstream tasks in 3D vision, such as text-to-shape
generation. This motivates our investigation of how these pre-trained models
can be used effectively to generate 3D shapes from sketches, which has largely
remained an open challenge due to the limited sketch-shape paired datasets and
the varying level of abstraction in the sketches. We discover that conditioning
a 3D generative model on the features (obtained from a frozen large pre-trained
vision model) of synthetic renderings during training enables us to effectively
generate 3D shapes from sketches at inference time. This suggests that the
large pre-trained vision model features carry semantic signals that are
resilient to domain shifts, i.e., they allow us to train using only RGB renderings
yet generalize to sketches at inference time. We conduct a comprehensive set of
experiments investigating different design factors and demonstrate the
effectiveness of our straightforward approach for generating multiple 3D
shapes per input sketch, regardless of their level of abstraction, without
requiring any paired datasets during training.
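
To make the training/inference idea concrete, below is a minimal, hypothetical sketch of the conditioning scheme described in the abstract: a frozen pre-trained vision encoder extracts features from synthetic RGB renderings during training, and the same encoder is applied to sketches at inference. The choice of CLIP as the frozen encoder and the placeholder generator architecture are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch only; the paper's actual architecture and training
# details may differ. Assumption: CLIP ViT-B/32 as the frozen vision model.
import torch
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"

# Frozen large pre-trained vision model (assumption: CLIP image encoder).
vision_model, preprocess = clip.load("ViT-B/32", device=device)
for p in vision_model.parameters():
    p.requires_grad = False

@torch.no_grad()
def encode_image(pil_image):
    """Extract semantic features from an image (rendering or sketch)."""
    x = preprocess(pil_image).unsqueeze(0).to(device)
    return vision_model.encode_image(x)  # (1, 512) feature vector

# Hypothetical conditional 3D generator: maps a conditioning feature plus a
# noise vector (for diversity) to some flattened 3D shape representation.
class Conditional3DGenerator(torch.nn.Module):
    def __init__(self, cond_dim=512, noise_dim=64, out_dim=2048):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(cond_dim + noise_dim, 1024),
            torch.nn.ReLU(),
            torch.nn.Linear(1024, out_dim),  # e.g., flattened voxels/latents
        )

    def forward(self, cond, noise):
        return self.net(torch.cat([cond.float(), noise], dim=-1))

generator = Conditional3DGenerator().to(device)

# Training: condition on features of synthetic RGB renderings of shapes,
#   feat = encode_image(rendering); pred = generator(feat, noise); loss vs. GT shape
# Inference: swap renderings for sketches; the frozen features are assumed to
# be robust to the domain shift, so no paired sketch-shape data is required.
#   feat = encode_image(sketch)
#   for _ in range(k):  # multiple shapes per sketch via different noise samples
#       shape = generator(feat, torch.randn(1, 64, device=device))
```

The key design point the abstract emphasizes is that the vision model stays frozen: only the conditional generator is trained, so the semantic features seen at training time (renderings) and at inference time (sketches) come from the same fixed embedding space.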