A Variational Perspective on Solving Inverse Problems with Diffusion Models

Abstract

Diffusion models have emerged as a key pillar of foundation models in visual domains. One of their critical applications is to solve different downstream inverse tasks universally with a single diffusion prior, without retraining for each task. Most inverse tasks can be formulated as inferring a posterior distribution over data (e.g., a full image) given a measurement (e.g., a masked image). This is challenging in diffusion models, however, because the nonlinear and iterative nature of the diffusion process renders the posterior intractable. To cope with this challenge, we propose a variational approach that by design seeks to approximate the true posterior distribution. We show that our approach naturally leads to regularization by denoising diffusion process (RED-Diff), in which denoisers at different timesteps concurrently impose different structural constraints on the image. To gauge the contribution of denoisers from different timesteps, we propose a weighting mechanism based on the signal-to-noise ratio (SNR). Our approach provides a new variational perspective for solving inverse problems with diffusion models, allowing us to formulate sampling as stochastic optimization, where one can simply apply off-the-shelf solvers with lightweight iterates. Our experiments on image restoration tasks such as inpainting and super-resolution demonstrate the strengths of our method compared with state-of-the-art sampling-based diffusion models.
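To make the idea of "sampling as stochastic optimization" concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes a standard Gaussian prior (so the ideal noise predictor has the closed form `sigma_t * x_t`), a cosine noise schedule, a binary inpainting mask, a regularization weight proportional to `sigma_t / alpha_t` (one possible SNR-based choice), and plain stochastic gradient descent with the predictor output treated as a constant. The names `toy_noise_predictor` and `restore`, and all hyperparameters, are hypothetical and chosen only for illustration.

```python
import numpy as np

def toy_noise_predictor(x_t, sigma_t):
    # Assumption: data prior is x0 ~ N(0, I) and x_t = alpha_t*x0 + sigma_t*eps
    # with alpha_t^2 + sigma_t^2 = 1. Then E[eps | x_t] = sigma_t * x_t, which
    # stands in for a trained diffusion denoiser in this toy example.
    return sigma_t * x_t

def restore(y, mask, lam=0.25, lr=0.05, steps=2000, seed=0):
    """Estimate an image mean `mu` from masked measurements `y` by SGD."""
    rng = np.random.default_rng(seed)
    mu = np.zeros_like(y)
    for _ in range(steps):
        # Draw a random timestep and noise sample (one stochastic iterate).
        t = rng.uniform(0.05, 0.95)
        alpha_t = np.cos(0.5 * np.pi * t)
        sigma_t = np.sin(0.5 * np.pi * t)
        eps = rng.standard_normal(mu.shape)
        x_t = alpha_t * mu + sigma_t * eps

        # Denoiser residual at this timestep, treated as a constant
        # (no gradient is propagated through the predictor).
        residual = toy_noise_predictor(x_t, sigma_t) - eps

        # Assumed SNR-based weight: noisier timesteps get larger weight.
        weight = lam * sigma_t / alpha_t

        # Gradient of 0.5*||mask*mu - y||^2 plus the weighted regularizer.
        grad = mask * (mu - y) + weight * residual
        mu -= lr * grad
    return mu

# Toy inpainting problem: observe every other entry of a 16-dim signal.
data_rng = np.random.default_rng(1)
x_true = data_rng.standard_normal(16)
mask = (np.arange(16) % 2 == 0).astype(float)
y = mask * x_true
mu = restore(y, mask)
```

In this toy setting the observed entries of `mu` settle near a shrunken copy of the measurements, while the unobserved entries stay near the prior mean of zero. With a trained denoiser, the same loop structure would apply with the closed-form predictor swapped out, and an off-the-shelf optimizer could replace the hand-written update.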

