NeRFiller: Completing Scenes via Generative 3D Inpainting
December 7, 2023
Authors: Ethan Weber, Aleksander Hołyński, Varun Jampani, Saurabh Saxena, Noah Snavely, Abhishek Kar, Angjoo Kanazawa
cs.AI
Abstract
We propose NeRFiller, an approach that completes missing portions of a 3D
capture via generative 3D inpainting using off-the-shelf 2D visual generative
models. Often parts of a captured 3D scene or object are missing due to mesh
reconstruction failures or a lack of observations (e.g., contact regions, such
as the bottom of objects, or hard-to-reach areas). We approach this challenging
3D inpainting problem by leveraging a 2D inpainting diffusion model. We
identify a surprising behavior of these models, where they generate more 3D
consistent inpaints when images form a 2×2 grid, and show how to
generalize this behavior to more than four images. We then present an iterative
framework to distill these inpainted regions into a single consistent 3D scene.
In contrast to related works, we focus on completing scenes rather than
deleting foreground objects, and our approach does not require tight 2D object
masks or text. We compare our approach to relevant baselines adapted to our
setting on a variety of scenes, where NeRFiller creates the most 3D consistent
and plausible scene completions. Our project page is at
https://ethanweber.me/nerfiller.
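
As a rough illustration of the grid behavior described in the abstract, the sketch below tiles four rendered views (and their inpainting masks) into a single 2×2 image, runs an off-the-shelf 2D inpainting diffusion model on the grid, and splits the result back into four views. The model ID, file paths, resolution, and empty prompt are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of 2x2 "grid inpainting": four views with missing regions are
# tiled into one image, inpainted jointly, then split back into separate views.
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline


def tile_2x2(images):
    """Arrange four equally sized PIL images into a single 2x2 grid (row-major)."""
    w, h = images[0].size
    grid = Image.new("RGB", (2 * w, 2 * h))
    for i, im in enumerate(images):
        grid.paste(im, ((i % 2) * w, (i // 2) * h))
    return grid


def split_2x2(grid):
    """Split a 2x2 grid image back into its four tiles (row-major)."""
    w, h = grid.size[0] // 2, grid.size[1] // 2
    return [
        grid.crop(((i % 2) * w, (i // 2) * h, (i % 2 + 1) * w, (i // 2 + 1) * h))
        for i in range(4)
    ]


# Off-the-shelf 2D inpainting diffusion model (illustrative choice).
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-inpainting", torch_dtype=torch.float16
).to("cuda")

# views: four rendered RGB images with missing regions; masks: white where missing.
# File names are placeholders for whatever renders/masks you have on hand.
views = [Image.open(f"view_{i}.png").convert("RGB").resize((256, 256)) for i in range(4)]
masks = [Image.open(f"mask_{i}.png").convert("L").resize((256, 256)) for i in range(4)]

grid_image = tile_2x2(views)                      # 512x512 grid of four views
grid_mask = tile_2x2([m.convert("RGB") for m in masks])
result = pipe(prompt="", image=grid_image, mask_image=grid_mask).images[0]
inpainted_views = split_2x2(result)               # four jointly inpainted views
```

In the paper's framing, inpaints produced jointly over such a grid tend to be more mutually consistent than inpainting each view independently; an iterative distillation loop then folds these inpainted views back into a single consistent 3D scene.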