ZeroAvatar: Zero-shot 3D Avatar Generation from a Single Image
May 25, 2023
Authors: Zhenzhen Weng, Zeyu Wang, Serena Yeung
cs.AI
Abstract
Recent advancements in text-to-image generation have enabled significant
progress in zero-shot 3D shape generation. This is achieved by score
distillation, a methodology that uses pre-trained text-to-image diffusion
models to optimize the parameters of a 3D neural representation, e.g., a Neural
Radiance Field (NeRF). While showing promising results, existing methods are
often not able to preserve the geometry of complex shapes, such as human
bodies. To address this challenge, we present ZeroAvatar, a method that
introduces an explicit 3D human body prior into the optimization process.
Specifically, we first estimate and refine the parameters of a parametric human
body from a single image. Then during optimization, we use the posed parametric
body as an additional geometric constraint to regularize the diffusion model as
well as the underlying density field. Lastly, we propose a UV-guided texture
regularization term to further guide the completion of texture on invisible
body parts. We show that ZeroAvatar significantly enhances the robustness and
3D consistency of optimization-based image-to-3D avatar generation,
outperforming existing zero-shot image-to-3D methods.
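
The score distillation idea mentioned in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the "renderer" is the identity, `toy_noise_predictor` stands in for a pretrained diffusion model, the weighting w(t) = 1 - alpha_bar is one possible choice, and the quadratic pull toward `prior` is only a placeholder for a geometric regularizer (the abstract does not specify the actual loss terms).

```python
import math
import random


def toy_noise_predictor(x_t, alpha_bar, target):
    """Hypothetical stand-in for a pretrained diffusion model.

    For a data distribution concentrated on `target`, the ideal noise
    prediction is (x_t - sqrt(alpha_bar) * target) / sqrt(1 - alpha_bar).
    """
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - a * m) / s for xt, m in zip(x_t, target)]


def sds_gradient(x, target, rng):
    """One score-distillation gradient estimate with an identity 'renderer'."""
    alpha_bar = rng.uniform(0.2, 0.8)        # sample a random noise level
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    eps = [rng.gauss(0.0, 1.0) for _ in x]   # noise injected into the render
    x_t = [a * xi + s * e for xi, e in zip(x, eps)]
    eps_hat = toy_noise_predictor(x_t, alpha_bar, target)
    weight = 1.0 - alpha_bar                 # assumed weighting w(t)
    # Gradient: w(t) * (eps_hat - eps); the renderer Jacobian is the identity.
    return [weight * (eh - e) for eh, e in zip(eps_hat, eps)]


def optimize(x0, target, prior, prior_weight=0.1, lr=0.5, steps=200, seed=0):
    """Gradient descent on the toy score-distillation signal.

    `prior` and `prior_weight` are hypothetical: a quadratic penalty that
    only gestures at regularizing the optimization with a known shape.
    """
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        g = sds_gradient(x, target, rng)
        x = [xi - lr * (gi + prior_weight * (xi - pi))
             for xi, gi, pi in zip(x, g, prior)]
    return x
```

In this toy setting the injected noise cancels exactly in `eps_hat - eps`, so the parameters are pulled toward `target` at every step. With a real diffusion model the gradient is noisy and the rendered views depend on 3D parameters through a differentiable renderer.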