GECO: Generative Image-to-3D within a SECOnd

Abstract

3D generation has seen remarkable progress in recent years. Existing techniques such as score distillation produce notable results but require extensive per-scene optimization, which makes them slow. Reconstruction-based approaches, by contrast, prioritize efficiency but compromise quality because of their limited handling of uncertainty. We introduce GECO, a novel method for high-quality 3D generative modeling that operates within a second. Our method addresses the prevalent issues of uncertainty and inefficiency in current methods through a two-stage process. In the first stage, we train a single-step multi-view generative model with score distillation. In the second stage, a further distillation is applied to address the view inconsistency that arises from the multi-view prediction. This two-stage process balances quality and efficiency in 3D generation. Our comprehensive experiments demonstrate that GECO achieves high-quality image-to-3D generation with an unprecedented level of efficiency.
