
High-Fidelity Novel View Synthesis via Splatting-Guided Diffusion

February 18, 2025
Authors: Xiang Zhang, Yang Zhang, Lukas Mehl, Markus Gross, Christopher Schroers
cs.AI

Abstract

Despite recent advances in Novel View Synthesis (NVS), generating high-fidelity views from single or sparse observations remains a significant challenge. Existing splatting-based approaches often produce distorted geometry due to splatting errors. While diffusion-based methods leverage rich 3D priors to achieve improved geometry, they often suffer from texture hallucination. In this paper, we introduce SplatDiff, a pixel-splatting-guided video diffusion model designed to synthesize high-fidelity novel views from a single image. Specifically, we propose an aligned synthesis strategy for precise control of target viewpoints and geometry-consistent view synthesis. To mitigate texture hallucination, we design a texture bridge module that enables high-fidelity texture generation through adaptive feature fusion. In this manner, SplatDiff leverages the strengths of splatting and diffusion to generate novel views with consistent geometry and high-fidelity details. Extensive experiments verify the state-of-the-art performance of SplatDiff in single-view NVS. Additionally, without extra training, SplatDiff shows remarkable zero-shot performance across diverse tasks, including sparse-view NVS and stereo video conversion.
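
The abstract describes the texture bridge only at a high level. As a rough illustration of what "adaptive feature fusion" between a splatted (rendered) branch and a diffusion branch could look like, here is a minimal PyTorch sketch. The module name, gating design, and tensor shapes are assumptions made for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn

class TextureBridge(nn.Module):
    """Hypothetical sketch of an adaptive feature-fusion block.

    Fuses features from the pixel-splatting branch with diffusion
    decoder features via a learned per-pixel gate, so high-fidelity
    source textures can take precedence over hallucinated ones.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Predict a per-pixel fusion weight from both feature maps.
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, splat_feat: torch.Tensor, diff_feat: torch.Tensor) -> torch.Tensor:
        # alpha -> 1: trust the splatted texture; alpha -> 0: trust diffusion.
        alpha = self.gate(torch.cat([splat_feat, diff_feat], dim=1))
        return alpha * splat_feat + (1.0 - alpha) * diff_feat


# Toy usage with random feature maps of shape (B, C, H, W).
bridge = TextureBridge(channels=64)
fused = bridge(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```

A per-pixel gate of this kind would let the model fall back on diffusion priors in disoccluded regions where splatting produces holes or distortions, while preserving observed textures elsewhere; whether SplatDiff uses this exact mechanism is not specified in the abstract.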
