FrugalNeRF: Fast Convergence for Few-shot Novel View Synthesis without Learned Priors

Abstract

Neural Radiance Fields (NeRF) face significant challenges in few-shot scenarios, primarily due to overfitting and long training times for high-fidelity rendering. Existing methods, such as FreeNeRF and SparseNeRF, use frequency regularization or pre-trained priors but struggle with complex scheduling and bias. We introduce FrugalNeRF, a novel few-shot NeRF framework that leverages weight-sharing voxels across multiple scales to efficiently represent scene details. Our key contribution is a cross-scale geometric adaptation scheme that selects pseudo ground truth depth based on reprojection errors across scales. This guides training without relying on externally learned priors, enabling full utilization of the training data. It can also integrate pre-trained priors, enhancing quality without slowing convergence. Experiments on LLFF, DTU, and RealEstate-10K show that FrugalNeRF outperforms other few-shot NeRF methods while significantly reducing training time, making it a practical solution for efficient and accurate 3D scene reconstruction.
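
The abstract does not give implementation details, so the following is only a minimal sketch of the selection step in the cross-scale geometric adaptation scheme, under stated assumptions. It assumes that each scale has already produced a depth estimate per ray and that a reprojection error for each estimate is available; how that error is computed is left to the caller. The function name `select_pseudo_depth` and both argument names are hypothetical and do not come from the paper or its code.

```python
from typing import List, Sequence, Tuple


def select_pseudo_depth(
    depths_per_scale: Sequence[Sequence[float]],
    errors_per_scale: Sequence[Sequence[float]],
) -> Tuple[List[float], List[int]]:
    """Pick, for every ray, the depth from the scale with the lowest error.

    depths_per_scale[s][r]: depth estimated at scale s for ray r (assumed input).
    errors_per_scale[s][r]: reprojection error of that depth (assumed input).
    Returns the selected depth per ray and the index of the winning scale.
    Ties go to the scale with the smallest index.
    """
    num_scales = len(depths_per_scale)
    if num_scales == 0 or num_scales != len(errors_per_scale):
        raise ValueError("need the same non-zero number of scales for depths and errors")

    num_rays = len(depths_per_scale[0])
    for depths, errors in zip(depths_per_scale, errors_per_scale):
        if len(depths) != num_rays or len(errors) != num_rays:
            raise ValueError("every scale must provide one depth and one error per ray")

    pseudo_depth: List[float] = []
    best_scale: List[int] = []
    for r in range(num_rays):
        # Scale whose depth reprojects with the smallest error for this ray.
        s_best = min(range(num_scales), key=lambda s: errors_per_scale[s][r])
        pseudo_depth.append(depths_per_scale[s_best][r])
        best_scale.append(s_best)
    return pseudo_depth, best_scale


if __name__ == "__main__":
    # Three hypothetical scales, three rays.
    depths = [[2.0, 5.0, 1.0], [2.2, 4.0, 1.5], [1.9, 4.5, 3.0]]
    errors = [[0.30, 0.10, 0.50], [0.05, 0.40, 0.20], [0.20, 0.25, 0.02]]
    print(select_pseudo_depth(depths, errors))
```

In a training loop, the selected depths would presumably serve as the supervision target for the depth rendered at each scale, which is how the abstract describes the scheme guiding training without externally learned priors.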
