
Diffusion with Forward Models: Solving Stochastic Inverse Problems Without Direct Supervision

June 20, 2023
Authors: Ayush Tewari, Tianwei Yin, George Cazenavette, Semon Rezchikov, Joshua B. Tenenbaum, Frédo Durand, William T. Freeman, Vincent Sitzmann
cs.AI

Abstract

Denoising diffusion models are a powerful type of generative model used to capture complex distributions of real-world signals. However, their applicability is limited to scenarios where training samples are readily available, which is not always the case in real-world applications. For example, in inverse graphics, the goal is to generate samples from a distribution of 3D scenes that align with a given image, but ground-truth 3D scenes are unavailable and only 2D images are accessible. To address this limitation, we propose a novel class of denoising diffusion probabilistic models that learn to sample from distributions of signals that are never directly observed. Instead, these signals are measured indirectly through a known differentiable forward model, which produces partial observations of the unknown signal. Our approach involves integrating the forward model directly into the denoising process. This integration effectively connects the generative modeling of observations with the generative modeling of the underlying signals, allowing for end-to-end training of a conditional generative model over signals. During inference, our approach enables sampling from the distribution of underlying signals that are consistent with a given partial observation. We demonstrate the effectiveness of our method on three challenging computer vision tasks. For instance, in the context of inverse graphics, our model enables direct sampling from the distribution of 3D scenes that align with a single 2D input image.
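
The idea of placing the forward model inside the denoising step can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a hypothetical forward model that simply reveals a subset of a 1D signal's entries, a linear map standing in for the learned denoiser, and the standard DDPM noising rule. It also omits the conditioning on a given partial observation that the abstract describes. All function and parameter names here are illustrative assumptions.

```python
import math
import random


def forward_model(signal, observed_idx):
    """Hypothetical known forward model: reveals only some entries of the signal."""
    return [signal[i] for i in observed_idx]


def add_noise(observation, alpha_bar, rng):
    """Standard DDPM noising: sqrt(alpha_bar) * y + sqrt(1 - alpha_bar) * eps."""
    return [
        math.sqrt(alpha_bar) * y + math.sqrt(1.0 - alpha_bar) * rng.gauss(0.0, 1.0)
        for y in observation
    ]


def toy_denoiser(noisy_obs, weights):
    """Stand-in for a learned network: maps a noisy observation to a full signal estimate."""
    return [sum(w * y for w, y in zip(row, noisy_obs)) for row in weights]


def training_loss(clean_obs, observed_idx, weights, alpha_bar, rng):
    """Loss is measured on observations only; the signal itself is never supervised."""
    noisy_obs = add_noise(clean_obs, alpha_bar, rng)
    signal_estimate = toy_denoiser(noisy_obs, weights)       # unobserved signal
    rendered = forward_model(signal_estimate, observed_idx)  # back to observation space
    return sum((r - y) ** 2 for r, y in zip(rendered, clean_obs)) / len(clean_obs)


# Example: a 4-entry signal of which only entries 0 and 2 are ever observed.
rng = random.Random(0)
observed_idx = [0, 2]
clean_obs = [0.5, -1.0]
weights = [[1.0, 0.0], [0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]  # 4x2 toy "network"
loss = training_loss(clean_obs, observed_idx, weights, alpha_bar=0.5, rng=rng)
```

In this sketch the denoiser outputs a full signal, yet the loss compares only the rendered observation with the clean one, which is the sense in which training needs no direct supervision on the signal.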