

Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation

July 8, 2023
作者: Aditya Sanghi, Pradeep Kumar Jayaraman, Arianna Rampini, Joseph Lambourne, Hooman Shayani, Evan Atherton, Saeid Asgari Taghanaki
cs.AI

Abstract

Significant progress has recently been made in creative applications of large pre-trained models for downstream tasks in 3D vision, such as text-to-shape generation. This motivates our investigation of how these pre-trained models can be used effectively to generate 3D shapes from sketches, which has largely remained an open challenge due to the limited sketch-shape paired datasets and the varying level of abstraction in the sketches. We discover that conditioning a 3D generative model on the features (obtained from a frozen large pre-trained vision model) of synthetic renderings during training enables us to effectively generate 3D shapes from sketches at inference time. This suggests that the large pre-trained vision model features carry semantic signals that are resilient to domain shifts, i.e., allowing us to use only RGB renderings, yet generalize to sketches at inference time. We conduct a comprehensive set of experiments investigating different design factors and demonstrate the effectiveness of our straightforward approach in generating multiple 3D shapes for each input sketch, regardless of its level of abstraction, without requiring any paired datasets during training.
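The core idea can be illustrated with a minimal sketch (not the authors' implementation): a frozen pre-trained vision encoder produces conditioning features from RGB renderings of shapes during training, and the same frozen encoder is fed a sketch at inference, relying on the robustness of its features to bridge the domain gap. The `FrozenVisionEncoder` wrapper and the `generator(shapes, cond)` / `generator.sample(...)` interfaces below are assumed placeholders; any pre-trained image encoder (e.g., a CLIP image tower) and any conditional 3D generative model could fill these roles.

```python
# Hypothetical sketch of training on renderings and sampling from sketches,
# assuming a conditional 3D generator with a (shapes, cond) -> loss forward
# pass and a .sample(cond, num_samples) method.
import torch
import torch.nn as nn


class FrozenVisionEncoder(nn.Module):
    """Wraps a pre-trained vision backbone and freezes its weights."""

    def __init__(self, backbone: nn.Module):
        super().__init__()
        self.backbone = backbone
        for p in self.backbone.parameters():
            p.requires_grad = False  # features stay fixed; only the generator is trained

    @torch.no_grad()
    def forward(self, images: torch.Tensor) -> torch.Tensor:
        return self.backbone(images)  # (B, feature_dim) conditioning vectors


def training_step(generator: nn.Module,
                  encoder: FrozenVisionEncoder,
                  renderings: torch.Tensor,
                  shapes: torch.Tensor,
                  optimizer: torch.optim.Optimizer) -> float:
    """One step: condition the 3D generator on features of synthetic RGB renderings."""
    cond = encoder(renderings)      # frozen features of renderings (no sketches used)
    loss = generator(shapes, cond)  # assumed interface: returns a scalar training loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


@torch.no_grad()
def generate_from_sketch(generator: nn.Module,
                         encoder: FrozenVisionEncoder,
                         sketch: torch.Tensor,
                         num_samples: int = 4) -> torch.Tensor:
    """Zero-shot inference: encode the sketch with the same frozen encoder and sample shapes."""
    cond = encoder(sketch)          # sketch features, never seen during training
    return generator.sample(cond, num_samples=num_samples)
```

Because only the generator is optimized, the frozen encoder acts as a fixed bridge between the rendering and sketch domains; no sketch-shape paired data enters the training loop.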