Hi3DGen: High-fidelity 3D Geometry Generation from Images via Normal Bridging

Abstract

With the growing demand for high-fidelity 3D models from 2D images, existing methods still struggle to reproduce fine-grained geometric details accurately because of domain gaps and the inherent ambiguities of RGB images. To address these issues, we propose Hi3DGen, a novel framework for generating high-fidelity 3D geometry from images via normal bridging. Hi3DGen consists of three key components: (1) an image-to-normal estimator that decouples low- and high-frequency image patterns with noise injection and dual-stream training to achieve generalizable, stable, and sharp estimation; (2) a normal-to-geometry learning approach that uses normal-regularized latent diffusion learning to enhance the fidelity of 3D geometry generation; and (3) a 3D data synthesis pipeline that constructs a high-quality dataset to support training. Extensive experiments demonstrate the effectiveness and superiority of our framework in generating rich geometric details, outperforming state-of-the-art methods in terms of fidelity. Our work provides a new direction for high-fidelity 3D geometry generation from images by leveraging normal maps as an intermediate representation.
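The abstract does not say how predicted normals are scored or regularized. As a hedged illustration of what working with a normal map involves, the sketch below assumes the common 8-bit RGB encoding of normals and measures the mean angular error between two small maps. The function names are hypothetical, and this is not the authors' estimator or training loss.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length; return it unchanged if it is all zeros."""
    norm = math.sqrt(sum(c * c for c in v))
    return tuple(c / norm for c in v) if norm > 0 else tuple(v)

def decode_normal(rgb):
    """Map an 8-bit RGB pixel to a unit normal.

    Assumption: the common encoding where [0, 255] maps linearly to [-1, 1].
    """
    return normalize(tuple(c / 127.5 - 1.0 for c in rgb))

def mean_angular_error_deg(pred, target):
    """Mean angle in degrees between corresponding normals of two equally sized maps."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            cos = sum(a * b for a, b in zip(normalize(p), normalize(t)))
            cos = max(-1.0, min(1.0, cos))  # guard against rounding outside [-1, 1]
            total += math.degrees(math.acos(cos))
            count += 1
    return total / count

# Toy 2x2 maps: a flat surface facing the camera and one tilted by 45 degrees.
flat = [[(0.0, 0.0, 1.0)] * 2 for _ in range(2)]
tilted = [[(0.0, 1.0, 1.0)] * 2 for _ in range(2)]
print(mean_angular_error_deg(flat, flat))    # identical maps give 0.0
print(mean_angular_error_deg(flat, tilted))  # close to 45 degrees
```

Angular error is only one way to compare normal maps; it is used here because it is independent of how the normals are stored.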
