

Taming Latent Diffusion Model for Neural Radiance Field Inpainting

April 15, 2024
作者: Chieh Hubert Lin, Changil Kim, Jia-Bin Huang, Qinbo Li, Chih-Yao Ma, Johannes Kopf, Ming-Hsuan Yang, Hung-Yu Tseng
cs.AI

Abstract

Neural Radiance Field (NeRF) is a representation for 3D reconstruction from multi-view images. Although recent work has shown preliminary success in editing a reconstructed NeRF with a diffusion prior, these methods still struggle to synthesize reasonable geometry in completely uncovered regions. One major reason is the high diversity of content synthesized by the diffusion model, which hinders the radiance field from converging to crisp, deterministic geometry. Moreover, applying latent diffusion models to real data often yields a textural shift that is incoherent with the image condition, due to auto-encoding errors. Both problems are further exacerbated by the use of pixel-distance losses. To address these issues, we propose tempering the diffusion model's stochasticity with per-scene customization and mitigating the textural shift with masked adversarial training. During our analysis, we also found that the commonly used pixel and perceptual losses are harmful in the NeRF inpainting task. Through rigorous experiments, our framework yields state-of-the-art NeRF inpainting results on various real-world scenes. Project page: https://hubert0527.github.io/MALD-NeRF
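The abstract's key idea of "masked adversarial training" can be illustrated with a minimal sketch: adversarial (hinge) losses are evaluated only inside the inpainting mask, so the discriminator penalizes textural shift in the synthesized region without disturbing the observed pixels. This is an illustrative reconstruction, not the paper's exact formulation; the function name, the hinge form, and the per-pixel discriminator logits are all assumptions.

```python
import numpy as np

def masked_adversarial_loss(disc_real, disc_fake, mask, eps=1e-8):
    """Hinge GAN losses restricted to the inpainted region.

    disc_real, disc_fake: per-pixel discriminator logits, shape (H, W).
    mask: 1.0 inside the inpainted region, 0.0 on observed pixels.
    NOTE: illustrative sketch only; MALD-NeRF's actual loss may differ.
    """
    denom = mask.sum() + eps  # number of masked pixels
    # Discriminator hinge loss, averaged over masked pixels only
    d_loss = ((mask * np.maximum(0.0, 1.0 - disc_real)).sum()
              + (mask * np.maximum(0.0, 1.0 + disc_fake)).sum()) / denom
    # Generator raises the discriminator score on the masked region only
    g_loss = -(mask * disc_fake).sum() / denom
    return d_loss, g_loss
```

Because every term is multiplied by the mask, gradients vanish on observed pixels, which is consistent with the abstract's motivation of avoiding pixel-distance supervision in the inpainted region.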

