Taming Mode Collapse in Score Distillation for Text-to-3D Generation
December 31, 2023
Authors: Peihao Wang, Dejia Xu, Zhiwen Fan, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra
cs.AI
Abstract
Despite the remarkable performance of score distillation in text-to-3D
generation, such techniques notoriously suffer from view inconsistency issues,
also known as the "Janus" artifact, where the generated objects fake each view with
multiple front faces. Although empirically effective methods have approached
this problem via score debiasing or prompt engineering, a more rigorous
perspective to explain and tackle this problem remains elusive. In this paper,
we reveal that the existing score distillation-based text-to-3D generation
frameworks degenerate to maximum likelihood seeking on each view independently
and thus suffer from the mode collapse problem, manifesting as the Janus
artifact in practice. To tame mode collapse, we improve score distillation by
re-establishing an entropy term in the corresponding variational objective,
which is applied to the distribution of rendered images. Maximizing the entropy
encourages diversity among different views in generated 3D assets, thereby
mitigating the Janus problem. Based on this new objective, we derive a new
update rule for 3D score distillation, dubbed Entropic Score Distillation
(ESD). We theoretically reveal that ESD can be simplified and implemented by
just adopting the classifier-free guidance trick upon variational score
distillation. Although embarrassingly straightforward, our extensive
experiments successfully demonstrate that ESD can be an effective treatment for
Janus artifacts in score distillation.
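
The "classifier-free guidance trick" mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation; it only shows the general shape such an update might take, under the assumption that the learned model of rendered images yields two noise predictions, one conditioned on the camera pose and one not, and that the two are blended with a weight. The names `cfg_mix`, `esd_style_direction`, and `lam`, as well as the random arrays standing in for network outputs, are hypothetical.

```python
import numpy as np

def cfg_mix(eps_cond, eps_uncond, lam):
    """Classifier-free-guidance-style blend of two noise predictions.

    Equivalent to eps_uncond + lam * (eps_cond - eps_uncond).
    """
    return lam * eps_cond + (1.0 - lam) * eps_uncond

def esd_style_direction(eps_pretrained, eps_q_cam, eps_q_nocam, lam):
    """Hypothetical update direction for a rendered image.

    eps_pretrained: prediction of the pretrained text-to-image model (assumed).
    eps_q_cam:      prediction of the rendered-image model, camera-conditioned (assumed).
    eps_q_nocam:    same model with the camera condition dropped (assumed).
    """
    return eps_pretrained - cfg_mix(eps_q_cam, eps_q_nocam, lam)

# Random arrays stand in for real network outputs in this sketch.
rng = np.random.default_rng(0)
shape = (4, 8, 8)
eps_pre = rng.standard_normal(shape)
eps_cam = rng.standard_normal(shape)
eps_nocam = rng.standard_normal(shape)

direction = esd_style_direction(eps_pre, eps_cam, eps_nocam, lam=0.5)
```

In this sketch, `lam = 1.0` uses only the camera-conditioned prediction and `lam = 0.0` uses only the prediction without the camera condition; intermediate values interpolate between the two. How the weight maps onto the entropy term, and which values work in practice, is specified in the paper itself rather than here.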