360° Reconstruction From a Single Image Using Space Carved Outpainting

Abstract

We introduce POP3D, a novel framework that creates a full 360°-view 3D model from a single image. POP3D resolves two prominent issues that limit single-view reconstruction. First, POP3D generalizes to arbitrary object categories, a trait that previous methods struggle to achieve. Second, POP3D further improves reconstruction fidelity and naturalness, a crucial aspect in which concurrent works fall short. Our approach combines the strengths of four primary components: (1) a monocular depth and normal predictor that predicts crucial geometric cues, (2) a space carving method capable of demarcating the potentially unseen portions of the target object, (3) a generative model pre-trained on a large-scale image dataset that can complete unseen regions of the target, and (4) a neural implicit surface reconstruction method tailored to reconstructing objects from RGB images together with monocular geometric cues. The combination of these components enables POP3D to readily generalize across various in-the-wild images and generate state-of-the-art reconstructions, outperforming similar works by a significant margin. Project page: http://cg.postech.ac.kr/research/POP3D
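To make component (2) more concrete, the following is a minimal sketch of single-view space carving, not the authors' implementation. It assumes an orthographic camera looking along the depth axis, a voxel grid aligned with the image pixels, and depth values expressed in voxel units; the function name `carve_single_view` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def carve_single_view(mask, depth, num_depth_bins):
    """Return a boolean (D, H, W) grid that is True where a voxel may be occupied.

    Assumptions (illustrative only): orthographic camera along the depth axis,
    one voxel column per pixel, depth measured in voxel units.
    """
    mask = np.asarray(mask, dtype=bool)
    depth = np.asarray(depth, dtype=float)
    # Depth index of each voxel slice, shaped (D, 1, 1) so it broadcasts over the image.
    z = np.arange(num_depth_bins, dtype=float)[:, None, None]
    # A voxel survives carving only if its pixel lies inside the object mask
    # and the voxel is at or behind the observed surface.
    return mask[None, :, :] & (z >= depth[None, :, :])


# Toy 2x2 image: one background pixel and three object pixels at different depths.
mask = np.array([[True, False], [True, True]])
depth = np.array([[1.0, 0.0], [2.0, 0.0]])
occupancy = carve_single_view(mask, depth, num_depth_bins=3)
print(occupancy.shape, int(occupancy.sum()))
```

Voxels that remain True are either on the visible surface or hidden behind it. In a pipeline like the one described above, such a region would bound where unseen parts of the object could lie and therefore where a generative model would need to complete content.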
