UniDream: Unifying Diffusion Priors for Relightable Text-to-3D Generation
December 14, 2023
Authors: Zexiang Liu, Yangguang Li, Youtian Lin, Xin Yu, Sida Peng, Yan-Pei Cao, Xiaojuan Qi, Xiaoshui Huang, Ding Liang, Wanli Ouyang
cs.AI
Abstract
Recent advances in text-to-3D generation have significantly improved the
conversion of textual descriptions into imaginative, geometrically sound, and
finely textured 3D objects. Despite these developments, a prevalent limitation
arises from the use of RGB data in diffusion or reconstruction models, which
often results in models with baked-in lighting and shadow effects that detract
from their realism, limiting their usability in applications that demand
accurate relighting. To bridge this gap, we present UniDream, a text-to-3D
generation framework that incorporates unified diffusion priors. Our approach
consists of three main components: (1) a dual-phase training process to obtain
albedo-normal aligned multi-view diffusion and reconstruction models, (2) a
progressive generation procedure for geometry and albedo textures based on
Score Distillation Sampling (SDS) using the trained reconstruction and
diffusion models, and (3) an innovative application of SDS, built on the
Stable Diffusion model, that finalizes physically based rendering (PBR)
material generation while keeping the albedo fixed. Extensive evaluations
demonstrate that UniDream surpasses existing methods in generating 3D objects
with clearer albedo textures, smoother surfaces, enhanced realism, and
superior relighting capabilities.
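
For context, SDS here refers to Score Distillation Sampling as introduced in DreamFusion. In its standard form, a rendered image $x = g(\theta)$ is noised to $x_t$ at a random timestep $t$, and the frozen diffusion prior's noise prediction drives the update of the 3D parameters $\theta$:

$$\nabla_\theta \mathcal{L}_{\mathrm{SDS}} = \mathbb{E}_{t,\epsilon}\!\left[ w(t)\,\big(\hat{\epsilon}_\phi(x_t;\, y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \right]$$

where $y$ is the text prompt, $\epsilon$ the sampled Gaussian noise, and $w(t)$ a timestep weighting. The PyTorch sketch below illustrates only this generic formulation; `eps_model`, `text_emb`, and `sds_step` are hypothetical names, and UniDream's specific multi-view and PBR-stage variants are not detailed in the abstract.

```python
import torch

# Minimal sketch of one generic SDS step, assuming a frozen noise-prediction
# network `eps_model(x_t, t, text_emb)` and a differentiably rendered image
# `x` (with requires_grad) produced from the 3D parameters. Names are
# illustrative, not UniDream's actual API.

def sds_step(x, eps_model, text_emb, alphas_cumprod):
    """Inject the SDS gradient into the rendered image x."""
    T = alphas_cumprod.shape[0]
    t = torch.randint(1, T, (1,))                      # random timestep
    a_t = alphas_cumprod[t]                            # cumulative alpha at t
    eps = torch.randn_like(x)                          # sampled noise
    x_t = a_t.sqrt() * x + (1.0 - a_t).sqrt() * eps    # forward diffusion
    with torch.no_grad():
        eps_pred = eps_model(x_t, t, text_emb)         # frozen prior's guess
    w_t = 1.0 - a_t                                    # common weighting w(t)
    grad = w_t * (eps_pred - eps)                      # SDS gradient on x
    # Backpropagate this gradient through the renderer into the 3D parameters,
    # skipping differentiation through the diffusion model itself.
    x.backward(gradient=grad)
```

In a full pipeline, this step would run inside the optimization loop of a differentiable renderer, with an optimizer update on the 3D parameters after each call.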