PlacidDreamer: Advancing Harmony in Text-to-3D Generation
July 19, 2024
Authors: Shuo Huang, Shikun Sun, Zixuan Wang, Xiaoyu Qin, Yanmin Xiong, Yuan Zhang, Pengfei Wan, Di Zhang, Jia Jia
cs.AI
Abstract
Recently, text-to-3D generation has attracted significant attention,
resulting in notable performance enhancements. Previous methods utilize
end-to-end 3D generation models to initialize 3D Gaussians, multi-view
diffusion models to enforce multi-view consistency, and text-to-image diffusion
models to refine details with score distillation algorithms. However, these
methods exhibit two limitations. Firstly, they encounter conflicts in
generation directions since different models aim to produce diverse 3D assets.
Secondly, the issue of over-saturation in score distillation has not been
thoroughly investigated and solved. To address these limitations, we propose
PlacidDreamer, a text-to-3D framework that harmonizes initialization,
multi-view generation, and text-conditioned generation with a single multi-view
diffusion model, while simultaneously employing a novel score distillation
algorithm to achieve balanced saturation. To unify the generation direction, we
introduce the Latent-Plane module, a training-friendly plug-in extension that
enables multi-view diffusion models to provide fast geometry reconstruction for
initialization and enhanced multi-view images to personalize the text-to-image
diffusion model. To address the over-saturation problem, we propose to view
score distillation as a multi-objective optimization problem and introduce the
Balanced Score Distillation algorithm, which offers a Pareto Optimal solution
that achieves both rich details and balanced saturation. Extensive experiments
validate the outstanding capabilities of our PlacidDreamer. The code is
available at https://github.com/HansenHuang0823/PlacidDreamer.
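
The multi-objective view of score distillation mentioned above can be illustrated with a generic two-objective example. The sketch below is not PlacidDreamer's Balanced Score Distillation algorithm; it only shows the standard closed-form min-norm combination of two gradients (the two-objective case of the multiple-gradient descent idea), which yields a direction that does not increase either objective to first order. The function name `min_norm_combination` and the toy gradients are assumptions made for illustration.

```python
import numpy as np


def min_norm_combination(g1, g2):
    """Return (alpha, d) where d = alpha*g1 + (1-alpha)*g2 has minimum norm.

    Hypothetical helper for illustration only. Stepping along -d is a
    common descent direction for both objectives; d == 0 indicates a
    Pareto-stationary point.
    """
    g1 = np.asarray(g1, dtype=float).ravel()
    g2 = np.asarray(g2, dtype=float).ravel()
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom == 0.0:
        # Identical gradients: any convex weight gives the same vector.
        return 0.5, g1.copy()
    # Closed-form minimizer of ||alpha*g1 + (1-alpha)*g2||^2, clipped to [0, 1].
    alpha = float(np.clip(((g2 - g1) @ g2) / denom, 0.0, 1.0))
    return alpha, alpha * g1 + (1.0 - alpha) * g2


# Toy gradients of two competing objectives (assumed values).
alpha, d = min_norm_combination([1.0, 0.0], [0.0, 1.0])
```

For the toy gradients above, the two objectives pull in orthogonal directions and receive equal weight; when one gradient dominates the other along the same direction, the weight is clipped so that the shorter gradient is used.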