HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting
February 9, 2024
Authors: Zhenglin Zhou, Fan Ma, Hehe Fan, Yi Yang
cs.AI
Abstract
Creating digital avatars from textual prompts has long been a desirable yet
challenging task. Despite the promising outcomes obtained through 2D diffusion
priors in recent works, current methods face challenges in achieving
high-quality and animated avatars effectively. In this paper, we present
HeadStudio, a novel framework that utilizes 3D Gaussian splatting to
generate realistic and animated avatars from text prompts. Our method drives 3D
Gaussians semantically to create a flexible and achievable appearance through
the intermediate FLAME representation. Specifically, we incorporate FLAME
into both the 3D representation and score distillation: 1) FLAME-based 3D Gaussian
splatting, which drives 3D Gaussian points by rigging each point to a FLAME mesh; and 2)
FLAME-based score distillation sampling, which uses FLAME-based fine-grained
control signals to guide score distillation from the text prompt. Extensive
experiments demonstrate the efficacy of HeadStudio in generating animatable
avatars from textual prompts, exhibiting visually appealing appearances. The
avatars can render high-quality novel views in real time (≥ 40 fps)
at a resolution of 1024. They can be smoothly controlled by real-world
speech and video. We hope that HeadStudio can advance digital avatar creation
and that the present method can be widely applied across various domains.
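
To make the idea of "rigging each point to a mesh" concrete, the sketch below shows one common way a point can be bound to a mesh triangle: the point is stored in the triangle's local frame and mapped to world space whenever the triangle moves. The abstract does not give the exact formulation, so the frame construction, the scale factor, and the names `triangle_frame` and `local_to_world` are illustrative assumptions, not the authors' implementation.

```python
import math

# Small vector helpers on 3-tuples (kept dependency-free on purpose).
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def norm(a): return math.sqrt(sum(x * x for x in a))
def normalize(a): return scale(a, 1.0 / norm(a))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_frame(v0, v1, v2):
    """Hypothetical helper: local frame of a triangle.

    Returns (origin, axes, size): the centroid, three orthonormal axes
    (first edge direction, in-plane perpendicular, normal), and a scalar
    size taken as the mean length of the two edges leaving v0.
    """
    origin = scale(add(add(v0, v1), v2), 1.0 / 3.0)
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    e1 = normalize(edge1)
    n = normalize(cross(edge1, edge2))
    e2 = cross(n, e1)
    size = 0.5 * (norm(edge1) + norm(edge2))
    return origin, (e1, e2, n), size

def local_to_world(local_pos, frame):
    """Map a point stored in triangle-local coordinates to world space."""
    origin, (e1, e2, n), size = frame
    x, y, z = local_pos
    offset = add(add(scale(e1, x), scale(e2, y)), scale(n, z))
    return add(origin, scale(offset, size))

# A point rigged half a "triangle size" above the face follows the
# triangle when the mesh is moved and enlarged.
local = (0.0, 0.0, 0.5)
rest = triangle_frame((0, 0, 0), (1, 0, 0), (0, 1, 0))
posed = triangle_frame((0, 0, 2), (2, 0, 2), (0, 2, 2))
p_rest = local_to_world(local, rest)
p_posed = local_to_world(local, posed)
```

Because the point is expressed relative to the triangle, animating the mesh (for example, by changing expression or pose parameters of a parametric head model) moves the attached points without re-optimizing them.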