Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Abstract

As demand grows for high-fidelity 3D models generated from 2D images, existing methods still struggle to reproduce fine-grained geometric details accurately, owing to domain gaps and the inherent ambiguities of RGB images. To address these issues, we propose Hi3DGen, a novel framework that generates high-fidelity 3D geometry from images via normal bridging, that is, by using normal maps as an intermediate representation. Hi3DGen consists of three key components: (1) an image-to-normal estimator that decouples low- and high-frequency image patterns through noise injection and dual-stream training to achieve generalizable, stable, and sharp estimation; (2) a normal-to-geometry learning approach that uses normal-regularized latent diffusion learning to enhance the fidelity of 3D geometry generation; and (3) a 3D data synthesis pipeline that constructs a high-quality dataset to support training. Extensive experiments demonstrate the effectiveness and superiority of our framework in generating rich geometric details, outperforming state-of-the-art methods in terms of fidelity. By leveraging normal maps as an intermediate representation, our work offers a new direction for high-fidelity 3D geometry generation from images.
