ICON: Incremental CONfidence for Joint Pose and Radiance Field Optimization

Abstract

Neural Radiance Fields (NeRF) exhibit remarkable performance for Novel View Synthesis (NVS) given a set of 2D images. However, NeRF training requires an accurate camera pose for each input view, typically obtained by Structure-from-Motion (SfM) pipelines. Recent works have attempted to relax this constraint, but they still often rely on decent initial poses that they then refine. Here we aim to remove the requirement for pose initialization. We present Incremental CONfidence (ICON), an optimization procedure for training NeRFs from 2D video frames. ICON assumes only smooth camera motion to estimate an initial guess for the poses. Further, ICON introduces "confidence": an adaptive measure of model quality used to dynamically reweight gradients. ICON relies on high-confidence poses to learn the NeRF, and on high-confidence 3D structure (as encoded by the NeRF) to learn poses. We show that ICON, without prior pose initialization, achieves superior performance on both CO3D and HO3D compared with methods that use SfM poses.
