Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision

June 20, 2023
Authors: Ayush Tewari, Tianwei Yin, George Cazenavette, Semon Rezchikov, Joshua B. Tenenbaum, Frédo Durand, William T. Freeman, Vincent Sitzmann
cs.AI

Abstract

Denoising diffusion models are a powerful class of generative models used to capture complex distributions of real-world signals. However, they are applicable only when training samples of the signal are readily available, which is not always the case in real-world applications. For example, in inverse graphics, the goal is to generate samples from a distribution of 3D scenes that align with a given image, but ground-truth 3D scenes are unavailable and only 2D images are accessible. To address this limitation, we propose a novel class of denoising diffusion probabilistic models that learn to sample from distributions of signals that are never directly observed. Instead, these signals are measured indirectly through a known differentiable forward model, which produces partial observations of the unknown signal. Our approach integrates the forward model directly into the denoising process. This integration connects the generative modeling of observations with the generative modeling of the underlying signals, allowing end-to-end training of a conditional generative model over signals. During inference, our approach enables sampling from the distribution of underlying signals that are consistent with a given partial observation. We demonstrate the effectiveness of our method on three challenging computer vision tasks. For instance, in the context of inverse graphics, our model enables direct sampling from the distribution of 3D scenes that align with a single 2D input image.
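To make the idea of placing the forward model inside the denoising step more concrete, the following is a minimal toy sketch, not the authors' implementation. Everything in it is an assumption introduced for illustration: the signal is a 4-vector, the hypothetical `forward_model` is a simple masking operator standing in for a renderer, and the "denoiser" is a single linear map with hypothetical `weights`. The point it shows is that the loss is computed only on observations, while the signal estimate appears as an intermediate quantity.

```python
import numpy as np


def forward_model(signal, mask):
    """Hypothetical known, differentiable forward model: keeps only masked entries."""
    return signal * mask


def add_noise(obs, alpha_bar, rng):
    """Standard DDPM noising: sqrt(alpha_bar) * obs + sqrt(1 - alpha_bar) * eps."""
    eps = rng.standard_normal(obs.shape)
    return np.sqrt(alpha_bar) * obs + np.sqrt(1.0 - alpha_bar) * eps


def denoise_via_signal(noisy_obs, weights, mask):
    """Toy denoiser: estimate the signal, then map it back to observation space."""
    signal_estimate = weights @ noisy_obs            # stand-in for a neural network
    denoised_obs = forward_model(signal_estimate, mask)
    return signal_estimate, denoised_obs


def observation_loss(noisy_obs, clean_obs, weights, mask):
    """Mean squared error in observation space; the true signal is never used."""
    _, denoised_obs = denoise_via_signal(noisy_obs, weights, mask)
    return float(np.mean((denoised_obs - clean_obs) ** 2))


rng = np.random.default_rng(0)
hidden_signal = rng.standard_normal(4)               # never shown to the loss
mask = np.array([1.0, 1.0, 0.0, 0.0])                # only two entries are observable
clean_obs = forward_model(hidden_signal, mask)
noisy_obs = add_noise(clean_obs, alpha_bar=0.5, rng=rng)
weights = np.eye(4)                                  # untrained placeholder parameters

loss = observation_loss(noisy_obs, clean_obs, weights, mask)
signal_estimate, _ = denoise_via_signal(noisy_obs, weights, mask)
```

In a real system the linear map would be a learned network and the forward model would be something like a differentiable renderer; because the supervision passes through the forward model, gradients of the observation-space loss would still shape the signal estimate.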