PlacidDreamer: Advancing Harmony in Text-to-3D Generation
July 19, 2024
Authors: Shuo Huang, Shikun Sun, Zixuan Wang, Xiaoyu Qin, Yanmin Xiong, Yuan Zhang, Pengfei Wan, Di Zhang, Jia Jia
cs.AI
Abstract
Recently, text-to-3D generation has attracted significant attention,
resulting in notable performance enhancements. Previous methods utilize
end-to-end 3D generation models to initialize 3D Gaussians, multi-view
diffusion models to enforce multi-view consistency, and text-to-image diffusion
models to refine details with score distillation algorithms. However, these
methods exhibit two limitations. Firstly, their generation directions conflict,
because the separate models each steer toward a different 3D asset.
Secondly, the issue of over-saturation in score distillation has not been
thoroughly investigated and solved. To address these limitations, we propose
PlacidDreamer, a text-to-3D framework that harmonizes initialization,
multi-view generation, and text-conditioned generation with a single multi-view
diffusion model, while simultaneously employing a novel score distillation
algorithm to achieve balanced saturation. To unify the generation direction, we
introduce the Latent-Plane module, a training-friendly plug-in extension that
enables multi-view diffusion models to provide fast geometry reconstruction for
initialization and enhanced multi-view images to personalize the text-to-image
diffusion model. To address the over-saturation problem, we propose to view
score distillation as a multi-objective optimization problem and introduce the
Balanced Score Distillation algorithm, which offers a Pareto-optimal solution
that achieves both rich details and balanced saturation. Extensive experiments
validate the outstanding capabilities of our PlacidDreamer. The code is
available at https://github.com/HansenHuang0823/PlacidDreamer.
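The abstract frames score distillation as a multi-objective optimization problem with a Pareto-optimal solution. As a rough illustration of that framing only, the sketch below combines two competing gradients with the generic min-norm rule from multi-objective optimization. It is not the paper's Balanced Score Distillation algorithm: the function names and the "detail" and "saturation" gradient vectors are hypothetical, chosen purely for this example.

```python
def min_norm_weight(g1, g2):
    """Return alpha in [0, 1] minimizing ||alpha*g1 + (1 - alpha)*g2||^2."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = sum(d * d for d in diff)
    if denom == 0.0:
        # Identical gradients: any weight gives the same direction.
        return 0.5
    # Unconstrained minimizer: alpha = (g2 - g1) . g2 / ||g1 - g2||^2
    num = sum((b - a) * b for a, b in zip(g1, g2))
    # Clip to [0, 1] so the result stays a convex combination.
    return min(1.0, max(0.0, num / denom))


def balanced_direction(g1, g2):
    """Convex combination of two gradients with the smallest norm."""
    alpha = min_norm_weight(g1, g2)
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical gradients of two objectives (assumed values, not from the paper).
grad_detail = [1.0, 0.0]
grad_saturation = [0.0, 1.0]
direction = balanced_direction(grad_detail, grad_saturation)
# Stepping against `direction` does not increase either objective to first
# order, because its inner product with both gradients is non-negative.
print(direction, dot(direction, grad_detail), dot(direction, grad_saturation))
```

When the two gradients point in exactly opposite directions with equal length, the combined direction is the zero vector, which signals a Pareto-stationary point: neither objective can be improved without worsening the other.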