ViewFusion: Towards Multi-View Consistency via Interpolated Denoising
February 29, 2024
Authors: Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton van den Hengel
cs.AI
Abstract
Novel-view synthesis through diffusion models has demonstrated remarkable
potential for generating diverse and high-quality images. Yet, because these
prevailing methods generate each image independently, they struggle to
maintain multi-view consistency. To address this, we introduce
ViewFusion, a novel, training-free algorithm that can be seamlessly integrated
into existing pre-trained diffusion models. Our approach adopts an
auto-regressive method that implicitly leverages previously generated views as
context for the next view generation, ensuring robust multi-view consistency
during the novel-view generation process. Through a diffusion process that
fuses known-view information via interpolated denoising, our framework
successfully extends single-view conditioned models to work in multi-view
conditioned settings without any additional fine-tuning. Extensive experimental
results demonstrate the effectiveness of ViewFusion in generating consistent
and detailed novel views.
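The abstract does not spell out the update rule, but one minimal reading of "interpolated denoising" is that, at every reverse-diffusion step, the frozen single-view model produces one noise prediction per conditioning view, and these predictions are interpolated into a single denoising direction. Below is a sketch under that assumption; the names `eps_model`, `alpha_bar`, and `weights` are illustrative placeholders, not the paper's API.

```python
import torch

def interpolated_denoising_step(x_t, t, cond_views, weights, eps_model, alpha_bar):
    """One deterministic (DDIM-style) reverse step that fuses several views.

    Assumptions (not taken from the paper): eps_model(x, t, view) is a
    frozen single-view conditioned noise predictor; weights is a list of
    per-view interpolation weights summing to 1; alpha_bar is the 1-D
    cumulative noise schedule of the pre-trained diffusion model.
    """
    # One noise prediction per known / previously generated view,
    # using the unmodified single-view model (no fine-tuning involved).
    eps_views = torch.stack([eps_model(x_t, t, v) for v in cond_views])

    # Interpolate the per-view predictions into one fused direction --
    # the step that extends a single-view conditioned model to a
    # multi-view conditioned setting.
    w = torch.tensor(weights, dtype=x_t.dtype).view(-1, *([1] * x_t.dim()))
    eps_fused = (w * eps_views).sum(dim=0)

    # Standard DDIM update using the fused noise estimate.
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
    x0_pred = (x_t - (1.0 - a_t).sqrt() * eps_fused) / a_t.sqrt()
    return a_prev.sqrt() * x0_pred + (1.0 - a_prev).sqrt() * eps_fused
```

In the auto-regressive scheme the abstract describes, each newly generated view would be appended to `cond_views` before sampling the next target pose, so every later view is implicitly conditioned on all earlier ones. Because only the sampling loop changes and the noise predictor stays frozen, the procedure is training-free, consistent with the abstract's claim that no additional fine-tuning is required.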