AvatarVerse: High-quality & Stable 3D Avatar Creation from Text and Pose

Abstract

Creating expressive, diverse, and high-quality 3D avatars from highly customized text descriptions and pose guidance is a challenging task, because 3D modeling and texturing must preserve fine detail across a range of styles (realistic, fictional, etc.). We present AvatarVerse, a stable pipeline for generating expressive, high-quality 3D avatars from nothing but text descriptions and pose guidance. Specifically, we introduce a 2D diffusion model conditioned on DensePose signals to establish 3D pose control of avatars through 2D images, which improves view consistency in partially observed scenarios. It addresses the infamous Janus problem and significantly stabilizes the generation process. Moreover, we propose a progressive high-resolution 3D synthesis strategy, which substantially improves the quality of the created 3D avatars. As a result, the proposed AvatarVerse pipeline achieves zero-shot 3D modeling of avatars that are not only more expressive, but also of higher quality and fidelity than those of previous works. Rigorous qualitative evaluations and user studies showcase AvatarVerse's superiority in synthesizing high-fidelity 3D avatars, setting a new standard in high-quality and stable 3D avatar creation. Our project page is: https://avatarverse3d.github.io
