Text-Guided Generation and Editing of Compositional 3D Avatars

Abstract

Our goal is to create a realistic 3D facial avatar with hair and accessories using only a text description. While this challenge has attracted significant recent interest, existing methods lack realism, produce unrealistic shapes, or do not support editing, such as changing the hairstyle. We argue that existing methods are limited because they employ a monolithic modeling approach, using a single representation for the head, face, hair, and accessories. Our observation is that the hair and face, for example, have very different structural qualities that benefit from different representations. Building on this insight, we generate avatars with a compositional model, in which the head, face, and upper body are represented with traditional 3D meshes, and the hair, clothing, and accessories with neural radiance fields (NeRF). The model-based mesh representation provides a strong geometric prior for the face region, improving realism while enabling editing of the person's appearance. By using NeRFs to represent the remaining components, our method is able to model and synthesize parts with complex geometry and appearance, such as curly hair and fluffy scarves. Our novel system synthesizes these high-quality compositional avatars from text descriptions. The experimental results demonstrate that our method, Text-guided generation and Editing of Compositional Avatars (TECA), produces avatars that are more realistic than those of recent methods while remaining editable because of their compositional nature. For example, TECA enables the seamless transfer of compositional features such as hairstyles, scarves, and other accessories between avatars. This capability supports applications such as virtual try-on.
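The compositional idea can be illustrated with a small data-structure sketch. This is not TECA's implementation, which the abstract does not describe. All class and function names here (`MeshComponent`, `NeRFComponent`, `CompositionalAvatar`, `transfer_part`) are hypothetical, and the actual meshes and radiance fields are stood in for by short text descriptions. The sketch only shows why keeping the mesh-based body and the NeRF-based parts separate makes it straightforward to move a part such as a hairstyle between avatars.

```python
from dataclasses import dataclass, field, replace
from typing import Dict


# Hypothetical names throughout: these are illustrative assumptions,
# not the data structures used by TECA.
@dataclass(frozen=True)
class MeshComponent:
    """A part represented as an explicit 3D mesh (head, face, upper body)."""
    name: str
    description: str  # text that describes this part


@dataclass(frozen=True)
class NeRFComponent:
    """A part represented as a neural radiance field (hair, clothing, accessories)."""
    name: str
    description: str  # text that describes this part


@dataclass(frozen=True)
class CompositionalAvatar:
    """An avatar made of one mesh-based body plus any number of NeRF-based parts."""
    body: MeshComponent
    parts: Dict[str, NeRFComponent] = field(default_factory=dict)


def transfer_part(source: CompositionalAvatar,
                  target: CompositionalAvatar,
                  part_name: str) -> CompositionalAvatar:
    """Return a copy of `target` that carries `part_name` taken from `source`."""
    if part_name not in source.parts:
        raise KeyError(f"source avatar has no part named {part_name!r}")
    new_parts = dict(target.parts)          # copy so `target` is left unchanged
    new_parts[part_name] = source.parts[part_name]
    return replace(target, parts=new_parts)


# Two toy avatars; the descriptions are made-up examples.
alice = CompositionalAvatar(
    body=MeshComponent("body", "a woman in her thirties"),
    parts={
        "hair": NeRFComponent("hair", "curly hair"),
        "scarf": NeRFComponent("scarf", "fluffy scarf"),
    },
)
bob = CompositionalAvatar(
    body=MeshComponent("body", "an elderly man"),
    parts={"hair": NeRFComponent("hair", "short gray hair")},
)

# Move only the hairstyle; Bob's body and Alice's scarf are untouched.
bob_with_curls = transfer_part(alice, bob, "hair")
```

In a monolithic representation there would be no separate `parts` entry to swap, which is the limitation the abstract attributes to earlier methods.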
