SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity
December 31, 2023
Authors: Peihao Wang, Zhiwen Fan, Dejia Xu, Dilin Wang, Sreyas Mohan, Forrest Iandola, Rakesh Ranjan, Yilei Li, Qiang Liu, Zhangyang Wang, Vikas Chandra
cs.AI
Abstract
Score distillation has emerged as one of the most prevalent approaches for
text-to-3D asset synthesis. Essentially, score distillation updates 3D
parameters by lifting and back-propagating scores averaged over different
views. In this paper, we reveal that the gradient estimation in score
distillation inherently suffers from high variance. Through the lens of variance
reduction, the effectiveness of SDS and VSD can be interpreted as applications
of various control variates to the Monte Carlo estimator of the distilled
score. Motivated by this rethinking and based on Stein's identity, we propose a
more general solution to reduce variance for score distillation, termed Stein
Score Distillation (SSD). SSD incorporates control variates constructed via
Stein's identity, allowing for arbitrary baseline functions. This enables us to
include flexible guidance priors and network architectures to explicitly
optimize for variance reduction. In our experiments, the overall pipeline,
dubbed SteinDreamer, is implemented by instantiating the control variate with a
monocular depth estimator. The results suggest that SSD can effectively reduce
the distillation variance and consistently improve visual quality for both
object- and scene-level generation. Moreover, we demonstrate that SteinDreamer
achieves faster convergence than existing methods due to more stable gradient
updates.
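The core idea of a Stein control variate can be illustrated on a toy problem. The sketch below is not the paper's method (which uses a monocular depth estimator as the baseline function over rendered views); it only shows the mechanism the abstract describes: for a density p with score ∇log p, Stein's identity gives E[∇log p(x)·φ(x) + φ'(x)] = 0 for any suitable baseline φ, so that term can be added to a Monte Carlo estimator with a coefficient β without changing its mean, while reducing its variance. Here p is a 1-D standard normal (score(x) = -x), the target is E[x²], and the baseline φ(x) = x is an arbitrary illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(100_000)

# Plain Monte Carlo estimator of E[x^2] under N(0, 1); true value is 1.
f = x ** 2

# Stein control variate with baseline phi(x) = x:
#   c(x) = score(x) * phi(x) + phi'(x) = -x * x + 1,
# which has expectation zero under N(0, 1) by Stein's identity.
c = 1.0 - x ** 2

# Variance-minimizing coefficient: beta = -Cov(f, c) / Var(c).
beta = -np.cov(f, c)[0, 1] / np.var(c, ddof=1)

# The corrected estimator keeps the same mean but has lower variance.
plain = f
stein = f + beta * c

print(np.mean(plain), np.var(plain))
print(np.mean(stein), np.var(stein))  # variance shrinks dramatically
```

For this toy choice of φ, the control variate is (up to a constant) perfectly anticorrelated with the integrand, so the corrected estimator is nearly constant. In the paper's setting the baseline function is learned/structured (a depth prior), and β is replaced by explicit optimization for variance reduction, but the mean-preserving property comes from the same identity.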