Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

Abstract

Recent text-to-3D generation methods achieve impressive 3D content creation capacity thanks to advances in image diffusion models and optimization strategies. However, current methods struggle to generate correct 3D content for semantically complex prompts, i.e., prompts describing multiple interacting objects bound to different attributes. In this work, we propose a general framework named Progressive3D, which decomposes the entire generation into a series of local, progressive editing steps to create precise 3D content for complex prompts. In each editing step, we constrain content changes to occur only in the regions determined by user-defined region prompts. Furthermore, we propose an overlapped semantic component suppression technique that encourages the optimization process to focus more on the semantic differences between prompts. Extensive experiments demonstrate that the proposed Progressive3D framework generates precise 3D content for prompts with complex semantics and applies generally to various text-to-3D methods driven by different 3D representations.
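The two constraints described in the abstract can be illustrated with a toy sketch. It is not the paper's formulation: plain Python lists stand in for guidance directions and 3D content, and the helper names `suppress_overlap` and `masked_update`, as well as the `weight` parameter, are hypothetical. The first helper down-weights the part of a target direction that overlaps with a source direction, so the remaining difference dominates; the second applies an update only inside a user-chosen region.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def suppress_overlap(source: Vector, target: Vector, weight: float = 4.0) -> Vector:
    """Down-weight the component of `target` that is parallel to `source`.

    `source` and `target` stand in for guidance directions derived from the
    prompts before and after an edit. Hypothetical helper, assumed for
    illustration only.
    """
    denom = dot(source, source)
    if denom == 0.0:
        # No source direction to project onto: nothing overlaps.
        return list(target)
    scale = dot(target, source) / denom
    overlap = [scale * s for s in source]                   # shared semantics
    difference = [t - o for t, o in zip(target, overlap)]   # new semantics
    return [o / weight + d for o, d in zip(overlap, difference)]


def masked_update(content: Vector, update: Vector, region: List[bool]) -> Vector:
    """Apply `update` only where `region` is True; other entries are unchanged."""
    return [c + u if inside else c for c, u, inside in zip(content, update, region)]
```

With `source = [1.0, 0.0]` and `target = [2.0, 3.0]`, the overlapping part `[2.0, 0.0]` is divided by `weight` while the differing part `[0.0, 3.0]` is kept in full. A larger `weight` shifts more of the emphasis onto what changed between the two prompts.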
