En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data

Abstract

We present En3D, an enhanced generative scheme for sculpting high-quality 3D human avatars. Unlike previous works that rely on scarce 3D datasets or limited 2D collections with imbalanced viewing angles and imprecise pose priors, our approach aims to develop a zero-shot 3D generative scheme capable of producing visually realistic, geometrically accurate, and content-wise diverse 3D humans without relying on pre-existing 3D or 2D assets. To address this challenge, we introduce a meticulously crafted workflow that implements accurate physical modeling to learn the enhanced 3D generative model from synthetic 2D data. During inference, we integrate optimization modules to bridge the gap between realistic appearances and coarse 3D shapes. Specifically, En3D comprises three modules: a 3D generator that accurately models generalizable 3D humans with realistic appearance from synthesized balanced, diverse, and structured human images; a geometry sculptor that enhances shape quality using multi-view normal constraints for intricate human anatomy; and a texturing module that disentangles explicit texture maps with fidelity and editability, leveraging semantic UV partitioning and a differentiable rasterizer. Experimental results show that our approach significantly outperforms prior works in terms of image quality, geometric accuracy, and content diversity. We also showcase the applicability of our generated avatars for animation and editing, as well as the scalability of our approach for content-style free adaptation.
