Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

Abstract

In this paper, we target the adaptive source-driven 3D scene editing task by proposing CustomNeRF, a model that accepts either a text description or a reference image as the editing prompt. Obtaining editing results that conform to the editing prompt is nontrivial, however, because of two significant challenges: accurately editing only the foreground regions, and maintaining multi-view consistency when given only a single-view reference image. To tackle the first challenge, we propose a Local-Global Iterative Editing (LGIE) training scheme that alternates between foreground region editing and full-image editing, aiming at foreground-only manipulation while preserving the background. For the second challenge, we design a class-guided regularization that exploits class priors within the generation model to alleviate inconsistency among different views in image-driven editing. Extensive experiments show that CustomNeRF produces precise editing results across various real scenes in both text-driven and image-driven settings.
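The alternation that LGIE describes can be illustrated with a toy sketch. This is not the authors' implementation: the strict one-to-one alternation, the flat lists of pixel values, and the names `lgie_schedule`, `composite`, and `render_for_step` are all assumptions made for illustration, since the abstract does not specify the schedule, the renderer, or the loss.

```python
from typing import List


def lgie_schedule(num_steps: int) -> List[str]:
    """Alternate local (foreground-only) and global (full-image) steps.

    Assumption: a strict 1:1 alternation; the abstract gives no ratio.
    """
    return ["local" if step % 2 == 0 else "global" for step in range(num_steps)]


def composite(foreground: List[float], background: List[float],
              mask: List[float]) -> List[float]:
    """Per-pixel blend: mask=1 keeps the foreground, mask=0 keeps the background."""
    return [m * f + (1.0 - m) * b
            for f, b, m in zip(foreground, background, mask)]


def render_for_step(mode: str, foreground: List[float],
                    background: List[float], mask: List[float]) -> List[float]:
    """Return the image an editing loss would see at this step (hypothetical)."""
    if mode == "local":
        # Local step: only foreground pixels are exposed to the editing signal.
        return [m * f for f, m in zip(foreground, mask)]
    # Global step: the full image, with the original background composited in.
    return composite(foreground, background, mask)


if __name__ == "__main__":
    fg, bg, mask = [0.5, 0.5], [0.2, 0.2], [1.0, 0.0]
    for mode in lgie_schedule(4):
        print(mode, render_for_step(mode, fg, bg, mask))
```

In this sketch the local step restricts the signal to the masked foreground, while the global step shows the edited foreground in the context of the untouched background, which is the intuition behind editing the foreground only while preserving the rest of the scene.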
