ViewFusion: Towards Multi-View Consistency via Interpolated Denoising
February 29, 2024
Authors: Xianghui Yang, Yan Zuo, Sameera Ramasinghe, Loris Bazzani, Gil Avraham, Anton van den Hengel
cs.AI
Abstract
Novel-view synthesis through diffusion models has demonstrated remarkable
potential for generating diverse and high-quality images. Yet, because these
prevailing methods generate each image independently, they struggle to
maintain multi-view consistency. To address this, we introduce
ViewFusion, a novel, training-free algorithm that can be seamlessly integrated
into existing pre-trained diffusion models. Our approach adopts an
auto-regressive method that implicitly leverages previously generated views as
context for the next view generation, ensuring robust multi-view consistency
during the novel-view generation process. Through a diffusion process that
fuses known-view information via interpolated denoising, our framework
successfully extends single-view conditioned models to work in multi-view
conditioned settings without any additional fine-tuning. Extensive experimental
results demonstrate the effectiveness of ViewFusion in generating consistent
and detailed novel views.
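The abstract does not spell out the update rule, but one minimal reading of "interpolated denoising" is that, at every reverse-diffusion step, the frozen single-view model produces one noise prediction per conditioning view, and these predictions are interpolated into a single denoising direction. Below is a sketch under that assumption; the names `eps_model`, `alpha_bar`, and `weights` are illustrative placeholders, not the paper's API.

```python
import torch

def interpolated_denoising_step(x_t, t, cond_views, weights, eps_model, alpha_bar):
    """One deterministic (DDIM-style) reverse step that fuses several views.

    Assumptions (not taken from the paper): eps_model(x, t, view) is a
    frozen single-view conditioned noise predictor; weights is a list of
    per-view interpolation weights summing to 1; alpha_bar is the 1-D
    cumulative noise schedule of the pre-trained diffusion model.
    """
    # One noise prediction per known / previously generated view,
    # using the unmodified single-view model (no fine-tuning involved).
    eps_views = torch.stack([eps_model(x_t, t, v) for v in cond_views])

    # Interpolate the per-view predictions into one fused direction --
    # the step that extends a single-view conditioned model to a
    # multi-view conditioned setting.
    w = torch.tensor(weights, dtype=x_t.dtype).view(-1, *([1] * x_t.dim()))
    eps_fused = (w * eps_views).sum(dim=0)

    # Standard DDIM update using the fused noise estimate.
    a_t = alpha_bar[t]
    a_prev = alpha_bar[t - 1] if t > 0 else torch.tensor(1.0)
    x0_pred = (x_t - (1.0 - a_t).sqrt() * eps_fused) / a_t.sqrt()
    return a_prev.sqrt() * x0_pred + (1.0 - a_prev).sqrt() * eps_fused
```

In the auto-regressive scheme the abstract describes, each newly generated view would be appended to `cond_views` before sampling the next target pose, so every later view is implicitly conditioned on all earlier ones. Because only the sampling loop changes and the noise predictor stays frozen, the procedure is training-free, consistent with the abstract's claim that no additional fine-tuning is required.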