Text-to-3D with Classifier Score Distillation

Abstract

Text-to-3D generation has made remarkable progress recently, particularly with methods based on Score Distillation Sampling (SDS), which leverages pre-trained 2D diffusion models. While the use of classifier-free guidance is widely acknowledged to be crucial for successful optimization, it has been regarded as an auxiliary trick rather than the most essential component. In this paper, we re-evaluate the role of classifier-free guidance in score distillation and arrive at a surprising finding: the guidance alone is enough for effective text-to-3D generation. We name this method Classifier Score Distillation (CSD), which can be interpreted as using an implicit classification model for generation. This perspective offers new insights into existing techniques. We validate the effectiveness of CSD across a variety of text-to-3D tasks, including shape generation, texture synthesis, and shape editing, achieving results superior to those of state-of-the-art methods. Project page: https://xinyu-andy.github.io/Classifier-Score-Distillation
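To make the central claim of the abstract concrete, the sketch below contrasts an SDS-style update direction that uses classifier-free guidance with a direction that keeps only the guidance term. It is a minimal illustration, not the authors' implementation: the function names, the list-based representation of noise predictions, and the split into a "generative prior" term and a "classifier score" term are assumptions based on the commonly used SDS formulation, and the timestep weighting and the renderer Jacobian are omitted.

```python
def sds_direction(eps_cond, eps_uncond, noise, guidance_scale):
    """Assumed SDS-style direction with classifier-free guidance.

    eps_cond:   noise predicted with the text prompt
    eps_uncond: noise predicted without the text prompt
    noise:      the noise that was actually added to the rendered image
    """
    direction = []
    for c, u, n in zip(eps_cond, eps_uncond, noise):
        generative_prior = c - n   # conditional prediction minus the injected noise
        classifier_score = c - u   # the classifier-free guidance term
        direction.append(generative_prior + guidance_scale * classifier_score)
    return direction


def csd_direction(eps_cond, eps_uncond):
    """Guidance-only direction: keep the guidance term and drop the rest."""
    return [c - u for c, u in zip(eps_cond, eps_uncond)]


# Toy numbers standing in for model outputs (hypothetical, for illustration only).
eps_cond = [1.0, 2.0]
eps_uncond = [0.5, 0.5]
noise = [0.0, 1.0]

sds = sds_direction(eps_cond, eps_uncond, noise, guidance_scale=2.0)
csd = csd_direction(eps_cond, eps_uncond)
```

Under these assumptions, the guidance-only direction depends solely on the difference between the conditional and unconditional predictions, which is why it can be read as the score of an implicit classifier.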
