ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
October 27, 2023
Authors: Kyle Sargent, Zizhang Li, Tanmay Shah, Charles Herrmann, Hong-Xing Yu, Yunzhi Zhang, Eric Ryan Chan, Dmitry Lagun, Li Fei-Fei, Deqing Sun, Jiajun Wu
cs.AI
Abstract
We introduce a 3D-aware diffusion model, ZeroNVS, for single-image novel view
synthesis for in-the-wild scenes. While existing methods are designed for
single objects with masked backgrounds, we propose new techniques to address
challenges introduced by in-the-wild multi-object scenes with complex
backgrounds. Specifically, we train a generative prior on a mixture of data
sources that capture object-centric, indoor, and outdoor scenes. To address
issues from data mixture such as depth-scale ambiguity, we propose a novel
camera conditioning parameterization and normalization scheme. Further, we
observe that Score Distillation Sampling (SDS) tends to truncate the
distribution of complex backgrounds during distillation of 360-degree scenes,
and propose "SDS anchoring" to improve the diversity of synthesized novel
views. Our model sets a new state-of-the-art result in LPIPS on the DTU dataset
in the zero-shot setting, even outperforming methods specifically trained on
DTU. We further adapt the challenging Mip-NeRF 360 dataset as a new benchmark
for single-image novel view synthesis, and demonstrate strong performance in
this setting. Our code and data are at http://kylesargent.github.io/zeronvs/
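The abstract mentions Score Distillation Sampling (SDS), the technique ZeroNVS builds its "SDS anchoring" on. As a rough illustration only, the sketch below shows the shape of one SDS gradient step in the style of DreamFusion: the current render is noised to a diffusion timestep, a (here, toy stand-in) denoiser predicts the noise, and the gradient is the weighted residual between predicted and true noise, skipping the denoiser Jacobian. The `toy_denoiser`, `alpha_bar` value, and step size are all hypothetical placeholders, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_denoiser(x_t, t):
    # Hypothetical stand-in for a frozen diffusion model's
    # noise prediction eps_hat(x_t; t); a real system would call
    # the pretrained 3D-aware diffusion model here.
    return 0.9 * x_t

def sds_gradient(render, t, alpha_bar, w=1.0):
    """One toy SDS gradient estimate for a rendered image.

    The render is noised to timestep t via the forward process
    x_t = sqrt(a) * x + sqrt(1 - a) * eps, and the gradient is
    w(t) * (eps_hat - eps), with the denoiser Jacobian dropped.
    """
    eps = rng.standard_normal(render.shape)
    x_t = np.sqrt(alpha_bar) * render + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = toy_denoiser(x_t, t)
    return w * (eps_hat - eps)

# Usage sketch: repeatedly nudge a toy "render" with SDS gradients.
render = rng.standard_normal((4, 4))
for _ in range(10):
    render = render - 0.1 * sds_gradient(render, t=500, alpha_bar=0.5)
```

In practice the gradient flows back through a differentiable renderer into NeRF parameters rather than into pixels directly; the paper's observation is that this mode-seeking objective tends to truncate the background distribution in 360-degree scenes, which "SDS anchoring" is proposed to counteract.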