DreamPartGen: Semantically Grounded Part-Level 3D Generation via Collaborative Latent Denoising

Abstract

Understanding and generating 3D objects as compositions of meaningful parts is fundamental to human perception and reasoning. However, most text-to-3D methods overlook the semantic and functional structure of parts. While recent part-aware approaches introduce decomposition, they remain largely geometry-focused, lacking semantic grounding and failing to model how parts align with textual descriptions or their inter-part relations. We propose DreamPartGen, a framework for semantically grounded, part-aware text-to-3D generation. DreamPartGen introduces Duplex Part Latents (DPLs) that jointly model each part's geometry and appearance, and Relational Semantic Latents (RSLs) that capture inter-part dependencies derived from language. A synchronized co-denoising process enforces mutual geometric and semantic consistency, enabling coherent, interpretable, and text-aligned 3D synthesis. Across multiple benchmarks, DreamPartGen delivers state-of-the-art performance in geometric fidelity and text-shape alignment.
