AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose

Abstract

Creating expressive, diverse, and high-quality 3D avatars from highly customized text descriptions and pose guidance is a challenging task, because 3D modeling and texturing must preserve fine detail across a range of styles (realistic, fictional, etc.). We present AvatarVerse, a stable pipeline for generating expressive, high-quality 3D avatars from nothing but text descriptions and pose guidance. Specifically, we introduce a 2D diffusion model conditioned on the DensePose signal to establish 3D pose control of avatars through 2D images, which enhances view consistency in partially observed scenarios. It addresses the infamous Janus problem and significantly stabilizes the generation process. Moreover, we propose a progressive high-resolution 3D synthesis strategy, which substantially improves the quality of the created 3D avatars. As a result, the proposed AvatarVerse pipeline achieves zero-shot modeling of 3D avatars that are not only more expressive, but also of higher quality and fidelity than those of previous works. Rigorous qualitative evaluations and user studies showcase AvatarVerse's superiority in synthesizing high-fidelity 3D avatars, leading to a new standard in high-quality and stable 3D avatar creation. Our project page is: https://avatarverse3d.github.io
