Deconstructing Denoising Diffusion Models for Self-Supervised Learning

January 25, 2024
Authors: Xinlei Chen, Zhuang Liu, Saining Xie, Kaiming He
cs.AI

Abstract

In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive procedure allows us to explore how various components of modern DDMs influence self-supervised representation learning. We observe that only a very few modern components are critical for learning good representations, while many others are nonessential. Our study ultimately arrives at an approach that is highly simplified and to a large extent resembles a classical DAE. We hope our study will rekindle interest in a family of classical methods within the realm of modern self-supervised learning.
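As a rough illustration of the classical DAE setup that the study converges toward, below is a minimal PyTorch sketch of a denoising-autoencoder training step: Gaussian noise is added to an image and a small encoder-decoder is trained to reconstruct the clean input. The architecture, noise level (`sigma`), and all hyperparameters here are illustrative assumptions, not the paper's actual configuration.

```python
# Minimal sketch of a classical denoising autoencoder (DAE) training step.
# The network, noise model, and hyperparameters below are placeholder choices,
# not the simplified architecture proposed in the paper.
import torch
import torch.nn as nn

class TinyDAE(nn.Module):
    def __init__(self, channels=3, dim=64):
        super().__init__()
        # Encoder maps the noisy image to a latent representation.
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, dim, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.Conv2d(dim, dim * 2, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
        )
        # Decoder reconstructs the clean image from that latent.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(dim * 2, dim, kernel_size=4, stride=2, padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(dim, channels, kernel_size=4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyDAE()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

clean = torch.randn(8, 3, 32, 32)            # stand-in for a batch of images
sigma = 0.5                                   # additive Gaussian noise level (hypothetical)
noisy = clean + sigma * torch.randn_like(clean)

recon = model(noisy)
loss = nn.functional.mse_loss(recon, clean)   # denoising objective: predict the clean image
loss.backward()
optimizer.step()

# For self-supervised evaluation, the encoder output would typically be pooled
# into a feature vector and probed with a linear classifier (not shown here).
```

In this classical formulation, the representation is simply the encoder's intermediate activations; the deconstruction in the paper asks which of the additional machinery in modern DDMs (noise schedules, tokenizers, multi-level noise, etc.) is actually needed on top of this baseline to learn good features.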