
SIGNeRF: Scene Integrated Generation for Neural Radiance Fields

January 3, 2024
Authors: Jan-Niklas Dihlmann, Andreas Engelhardt, Hendrik Lensch
cs.AI

Abstract

Advances in image diffusion models have recently led to notable improvements in the generation of high-quality images. In combination with Neural Radiance Fields (NeRFs), they have enabled new opportunities in 3D generation. However, most generative 3D approaches are object-centric, and applying them to editing existing photorealistic scenes is not trivial. We propose SIGNeRF, a novel approach for fast and controllable NeRF scene editing and scene-integrated object generation. A new generative update strategy ensures 3D consistency across the edited images without requiring iterative optimization. We find that depth-conditioned diffusion models inherently possess the capability to generate 3D-consistent views by requesting a grid of images instead of single views. Based on these insights, we introduce a multi-view reference sheet of modified images. Our method consistently updates an image collection based on the reference sheet and refines the original NeRF with the newly generated image set in a single step. By exploiting the depth-conditioning mechanism of the image diffusion model, we gain fine control over the spatial location of the edit and enforce shape guidance through a selected region or an external mesh.
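
The reference-sheet idea lends itself to a short sketch: render several NeRF views (color and depth), tile them into one grid image, edit the whole grid jointly with a depth-guided diffusion model so the views stay mutually consistent, then split the grid back into per-view training images. The sketch below is a minimal illustration using the Hugging Face diffusers library with a depth ControlNet as a stand-in for the paper's depth-conditioned model; the file names, grid size, prompt, and model IDs are assumptions for illustration, not the authors' implementation.

```python
# Illustrative sketch of grid-based, depth-conditioned generation:
# tile NeRF renders into one "reference sheet", edit them jointly with a
# depth-guided diffusion model, then split the sheet back into views.
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline

def make_grid(views, rows, cols):
    """Tile equally sized PIL images into a single reference-sheet image."""
    w, h = views[0].size
    sheet = Image.new("RGB", (cols * w, rows * h))
    for i, view in enumerate(views):
        sheet.paste(view.convert("RGB"), ((i % cols) * w, (i // cols) * h))
    return sheet

def split_grid(sheet, rows, cols):
    """Cut an edited sheet back into its individual views."""
    w, h = sheet.width // cols, sheet.height // rows
    return [sheet.crop((c * w, r * h, (c + 1) * w, (r + 1) * h))
            for r in range(rows) for c in range(cols)]

# Placeholder paths: pre-rendered color and depth views from the NeRF.
color_views = [Image.open(f"render_{i:02d}.png") for i in range(9)]
depth_views = [Image.open(f"depth_{i:02d}.png") for i in range(9)]

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-depth", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
    torch_dtype=torch.float16).to("cuda")

color_sheet = make_grid(color_views, rows=3, cols=3)
depth_sheet = make_grid(depth_views, rows=3, cols=3)  # geometry guidance

edited_sheet = pipe(
    prompt="a stone statue standing on the table",  # example edit
    image=color_sheet,          # joint denoising keeps views consistent
    control_image=depth_sheet,  # depth conditioning pins spatial layout
    strength=0.9,
).images[0]

edited_views = split_grid(edited_sheet, rows=3, cols=3)
# edited_views would then replace the corresponding training images before
# fine-tuning the NeRF once on the updated image set.
```

Editing the views as one grid, rather than one at a time, is what lets the diffusion model attend across viewpoints and produce mutually consistent edits, which is why no iterative per-view optimization is needed afterward.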