SIGNeRF: Scene Integrated Generation for Neural Radiance Fields

Abstract

Advances in image diffusion models have recently led to notable improvements in the generation of high-quality images. In combination with Neural Radiance Fields (NeRFs), they have opened new opportunities in 3D generation. However, most generative 3D approaches are object-centric, and applying them to the editing of existing photorealistic scenes is not trivial. We propose SIGNeRF, a novel approach for fast and controllable NeRF scene editing and scene-integrated object generation. A new generative update strategy ensures 3D consistency across the edited images without requiring iterative optimization. We find that depth-conditioned diffusion models inherently possess the capability to generate 3D-consistent views when asked for a grid of images instead of single views. Based on this insight, we introduce a multi-view reference sheet of modified images. Our method updates an image collection consistently based on the reference sheet and refines the original NeRF with the newly generated image set in a single pass. By exploiting the depth-conditioning mechanism of the image diffusion model, we gain fine control over the spatial location of the edit and enforce shape guidance through a selected region or an external mesh.
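The grid idea in the abstract can be illustrated with a small sketch. The abstract does not state the grid layout, the image format, or how the diffusion model is called, so everything below is an assumption: views are equally sized 2D lists of pixel values, the layout is row-major, and the function names `tile_views` and `split_grid` are hypothetical. The diffusion step itself is left out; the sketch only shows how several views might be packed into one sheet and cut apart again afterwards.

```python
def tile_views(views, rows, cols):
    """Pack rows*cols equally sized views into one grid image (row-major).

    Each view is a list of pixel rows; this layout is an assumption,
    not something the abstract specifies.
    """
    if len(views) != rows * cols:
        raise ValueError("expected exactly rows * cols views")
    height = len(views[0])
    grid = []
    for r in range(rows):
        for i in range(height):
            line = []
            # Concatenate pixel row i of every view in grid row r.
            for k in range(cols):
                line.extend(views[r * cols + k][i])
            grid.append(line)
    return grid


def split_grid(grid, rows, cols):
    """Inverse of tile_views: cut a grid image back into individual views."""
    height = len(grid) // rows
    width = len(grid[0]) // cols
    return [
        [row[k * width:(k + 1) * width] for row in grid[r * height:(r + 1) * height]]
        for r in range(rows)
        for k in range(cols)
    ]
```

In a real pipeline the views would more likely be image arrays, and the tiled color and depth sheets would be passed to a depth-conditioned diffusion model between the two calls; that step is omitted here because the abstract gives no details about it.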

