Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning

Abstract

As humans, we aspire to create media content that follows our free will yet remains easy to control. Thanks to the rapid development of generative techniques, we can now easily use 2D diffusion methods to synthesize images controlled by rough sketches or designated human poses, and even progressively edit or regenerate local regions with masked inpainting. However, similar workflows are still unavailable for 3D modeling tasks due to the lack of controllability and efficiency in 3D generation. In this paper, we present a novel controllable and interactive 3D asset modeling framework, named Coin3D. Coin3D allows users to control 3D generation using a coarse geometry proxy assembled from basic shapes, and introduces an interactive generation workflow that supports seamless local part editing while delivering responsive 3D object previews within a few seconds. To this end, we develop several techniques: a 3D adapter that applies volumetric coarse shape control to the diffusion model, a proxy-bounded editing strategy for precise part editing, a progressive volume cache to support responsive preview, and volume-SDS to ensure consistent mesh reconstruction. Extensive experiments on interactive generation and editing with diverse shape proxies demonstrate that our method achieves superior controllability and flexibility in the 3D asset generation task.
