
Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation

July 8, 2023
Authors: Aditya Sanghi, Pradeep Kumar Jayaraman, Arianna Rampini, Joseph Lambourne, Hooman Shayani, Evan Atherton, Saeid Asgari Taghanaki
cs.AI

Abstract

Significant progress has recently been made in creative applications of large pre-trained models for downstream tasks in 3D vision, such as text-to-shape generation. This motivates our investigation of how these pre-trained models can be used effectively to generate 3D shapes from sketches, which has largely remained an open challenge due to the limited sketch-shape paired datasets and the varying levels of abstraction in the sketches. We discover that conditioning a 3D generative model during training on features of synthetic renderings, obtained from a frozen large pre-trained vision model, enables us to effectively generate 3D shapes from sketches at inference time. This suggests that the features of large pre-trained vision models carry semantic signals that are resilient to domain shifts: we use only RGB renderings during training, yet generalize to sketches at inference time. We conduct a comprehensive set of experiments investigating different design factors and demonstrate the effectiveness of our straightforward approach for generating multiple 3D shapes from each input sketch, regardless of its level of abstraction, without requiring any paired datasets during training.
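The mechanism the abstract describes — train a 3D generative model conditioned on features of RGB renderings from a frozen pre-trained vision encoder, then feed sketch features through the same encoder at inference — can be summarized in a minimal sketch. The ResNet-50 backbone, the occupancy-grid decoder, and all hyperparameters below are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of the zero-shot conditioning idea from the abstract.
# The frozen backbone, decoder, and loss are illustrative placeholders.
import torch
import torch.nn as nn
from torchvision import models

# Frozen large pre-trained vision model (here an ImageNet ResNet-50
# stand-in; the paper only assumes a large frozen vision model).
backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
backbone.fc = nn.Identity()          # expose the 2048-d feature vector
backbone.eval()
for p in backbone.parameters():      # keep the feature extractor frozen
    p.requires_grad = False

class ShapeDecoder(nn.Module):
    """Hypothetical 3D generative head conditioned on image features.

    Maps a 2048-d conditioning vector to a coarse 32^3 occupancy grid;
    the real model would be a proper conditional 3D generative model.
    """
    def __init__(self, feat_dim=2048, grid=32):
        super().__init__()
        self.grid = grid
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 1024), nn.ReLU(),
            nn.Linear(1024, grid ** 3),
        )

    def forward(self, feats):
        logits = self.net(feats)
        return logits.view(-1, self.grid, self.grid, self.grid)

decoder = ShapeDecoder()
optim = torch.optim.Adam(decoder.parameters(), lr=1e-4)

def train_step(rgb_renderings, target_occupancy):
    """Training: condition only on RGB renderings of the shapes."""
    with torch.no_grad():                 # features come from the frozen model
        feats = backbone(rgb_renderings)  # (B, 2048)
    pred = decoder(feats)
    loss = nn.functional.binary_cross_entropy_with_logits(
        pred, target_occupancy)
    optim.zero_grad()
    loss.backward()
    optim.step()
    return loss.item()

@torch.no_grad()
def generate_from_sketch(sketch_images):
    """Inference: feed sketches through the same frozen backbone.

    The paper's claim is that the frozen features are resilient to the
    rendering-to-sketch domain shift, so no paired data is needed.
    """
    feats = backbone(sketch_images)
    return torch.sigmoid(decoder(feats)) > 0.5  # binary occupancy grid
```

The load-bearing choice in this sketch is that the backbone stays frozen: the training signal never sees a sketch, so zero-shot generalization rests entirely on the pre-trained features bridging the rendering-to-sketch domain gap.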