Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
October 18, 2023
Authors: Xinhua Cheng, Tianyu Yang, Jianan Wang, Yu Li, Lei Zhang, Jian Zhang, Li Yuan
cs.AI
Abstract
Recent text-to-3D generation methods achieve impressive 3D content creation
capacity thanks to advances in image diffusion models and optimization
strategies. However, current methods struggle to generate correct 3D content
for semantically complex prompts, i.e., prompts describing multiple
interacting objects bound to different attributes. In this work, we propose
a general framework named Progressive3D, which decomposes the entire generation
into a series of locally progressive editing steps to create precise 3D content
for complex prompts, and we constrain content changes to occur only in
regions determined by user-defined region prompts at each editing step.
Furthermore, we propose an overlapped semantic component suppression technique
to encourage the optimization process to focus more on the semantic differences
between prompts. Extensive experiments demonstrate that the proposed
Progressive3D framework generates precise 3D content for prompts with complex
semantics and generalizes to various text-to-3D methods driven by different 3D
representations.
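
To make the core idea concrete, here is a minimal, hypothetical sketch of region-constrained progressive editing; it is not the authors' implementation. The function name `edit_step`, the use of a binary `region_mask`, and the simple quadratic stand-in for a diffusion-guided gradient are all illustrative assumptions: they only show how an update can be confined to a user-defined region at each editing step.

```python
# Illustrative sketch only (not Progressive3D's actual code): one local
# editing step that moves the content toward a target, with updates
# zeroed outside a user-defined region mask.
import numpy as np

def edit_step(content: np.ndarray, target: np.ndarray,
              region_mask: np.ndarray, lr: float = 0.1,
              iters: int = 100) -> np.ndarray:
    """Optimize `content` toward `target`, but only where region_mask == 1."""
    content = content.copy()
    for _ in range(iters):
        grad = content - target              # stand-in for a diffusion-guided gradient
        content -= lr * grad * region_mask   # mask confines the change to the region
    return content

# Toy usage: a 1-D "scene" where only the middle region may change.
scene = np.zeros(8)
desired = np.ones(8)
mask = np.array([0, 0, 0, 1, 1, 0, 0, 0], dtype=float)
edited = edit_step(scene, desired, mask)
print(edited)  # nonzero only where mask == 1
```

In the full framework, such steps would be chained, with each step handling one object or attribute from the complex prompt while earlier results stay frozen outside the active region.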