ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation
May 25, 2023
Authors: Zhengyi Wang, Cheng Lu, Yikai Wang, Fan Bao, Chongxuan Li, Hang Su, Jun Zhu
cs.AI
Abstract
Score distillation sampling (SDS) has shown great promise in text-to-3D
generation by distilling pretrained large-scale text-to-image diffusion models,
but suffers from over-saturation, over-smoothing, and low-diversity problems.
In this work, we propose to model the 3D parameter as a random variable instead
of a constant as in SDS and present variational score distillation (VSD), a
principled particle-based variational framework to explain and address the
aforementioned issues in text-to-3D generation. We show that SDS is a special
case of VSD and leads to poor samples with both small and large CFG weights. In
comparison, VSD works well with various CFG weights as ancestral sampling from
diffusion models and simultaneously improves the diversity and sample quality
with a common CFG weight (i.e., 7.5). We further present various improvements
in the design space for text-to-3D such as distillation time schedule and
density initialization, which are orthogonal to the distillation algorithm yet
not well explored. Our overall approach, dubbed ProlificDreamer, can generate
high rendering resolution (i.e., 512×512) and high-fidelity NeRF with
rich structure and complex effects (e.g., smoke and drops). Further,
initialized from NeRF, meshes fine-tuned by VSD are meticulously detailed and
photo-realistic. Project page: https://ml.cs.tsinghua.edu.cn/prolificdreamer/
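
The claim that SDS is a special case of VSD can be made concrete by comparing the two gradient estimators. The block below is a sketch in commonly used score-distillation notation, not text from the abstract. All symbols are assumptions introduced here: g(θ, c) is a differentiable renderer of the 3D parameter θ from camera pose c, y is the text prompt, ω(t) is a time-dependent weight, ε_pretrain is the pretrained diffusion model's noise prediction, and ε_φ is a second noise predictor trained on noisy renderings of the current particles.

```latex
% Noisy rendering of the 3D parameter \theta from camera pose c
x_t = \alpha_t\, g(\theta, c) + \sigma_t\, \epsilon,
\qquad \epsilon \sim \mathcal{N}(0, I)

% SDS: \theta is treated as a constant; the residual is taken against the sampled noise
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon,c}\left[ \omega(t)\,
    \bigl( \epsilon_{\mathrm{pretrain}}(x_t, t, y) - \epsilon \bigr)\,
    \frac{\partial g(\theta, c)}{\partial \theta} \right]

% VSD: \theta is a random variable; the residual is taken against the
% estimated score of the noisy renderings of the current particles
\nabla_\theta \mathcal{L}_{\mathrm{VSD}}(\theta)
  = \mathbb{E}_{t,\epsilon,c}\left[ \omega(t)\,
    \bigl( \epsilon_{\mathrm{pretrain}}(x_t, t, y) - \epsilon_\phi(x_t, t, c, y) \bigr)\,
    \frac{\partial g(\theta, c)}{\partial \theta} \right]
```

Under these assumptions, collapsing the distribution over θ to a single point makes the distribution of noisy renderings Gaussian around that point, so the ε_φ term plays the role of the sampled noise ε and the VSD estimator reduces to the SDS one.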