InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes
January 10, 2024
Authors: Mohamad Shahbazi, Liesbeth Claessens, Michael Niemeyer, Edo Collins, Alessio Tonioni, Luc Van Gool, Federico Tombari
cs.AI
Abstract
We introduce InseRF, a novel method for generative object insertion in the
NeRF reconstructions of 3D scenes. Based on a user-provided textual description
and a 2D bounding box in a reference viewpoint, InseRF generates new objects in
3D scenes. Recently, methods for 3D scene editing have been profoundly
transformed, owing to the use of strong priors of text-to-image diffusion
models in 3D generative modeling. Existing methods are mostly effective in
editing 3D scenes via style and appearance changes or removing existing
objects. Generating new objects, however, remains a challenge for such methods,
which we address in this study. Specifically, we propose grounding the 3D
object insertion to a 2D object insertion in a reference view of the scene. The
2D edit is then lifted to 3D using a single-view object reconstruction method.
Finally, the reconstructed object is inserted into the scene, guided by the priors
of monocular depth estimation methods. We evaluate our method on various 3D
scenes and provide an in-depth analysis of the proposed components. Our
experiments with generative insertion of objects in several 3D scenes indicate
the effectiveness of our method compared to the existing methods. InseRF is
capable of controllable and 3D-consistent object insertion without requiring
explicit 3D information as input. Please visit our project page at
https://mohamad-shahbazi.github.io/inserf.
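
The abstract says the reconstructed object is placed in the scene using a 2D bounding box in a reference view and monocular depth priors, but it does not spell out the geometry. As a rough illustration only, and not the authors' implementation, the sketch below shows one way a 2D box plus a single depth estimate could be turned into a 3D position and size under a pinhole camera model. The names `Intrinsics`, `backproject`, and `place_object` are hypothetical, and the depth value is assumed to come from some monocular depth estimator.

```python
from dataclasses import dataclass


@dataclass
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical, supplied by the caller)."""
    fx: float
    fy: float
    cx: float
    cy: float


def backproject(u, v, depth, k):
    """Map pixel (u, v) at the given depth to a 3D point in camera coordinates."""
    x = (u - k.cx) * depth / k.fx
    y = (v - k.cy) * depth / k.fy
    return (x, y, depth)


def place_object(box, depth, k):
    """Return (center, width, height) of the object in camera space.

    box is (u_min, v_min, u_max, v_max) in pixels. depth is the estimated
    depth at the box center, assumed here to come from a monocular depth
    estimator; this is an illustrative assumption, not the paper's procedure.
    """
    u_min, v_min, u_max, v_max = box
    center = backproject((u_min + u_max) / 2, (v_min + v_max) / 2, depth, k)
    # Metric extent of the box at that depth, from similar triangles.
    width = (u_max - u_min) * depth / k.fx
    height = (v_max - v_min) * depth / k.fy
    return center, width, height


if __name__ == "__main__":
    k = Intrinsics(fx=500.0, fy=500.0, cx=320.0, cy=240.0)
    print(place_object((270, 190, 370, 290), depth=2.0, k=k))
```

A box centered on the principal point back-projects to a point on the optical axis, and its metric size grows linearly with the estimated depth. This is why an error in the depth estimate shifts both the position and the scale of the inserted object.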