Deconstructing Denoising Diffusion Models for Self-Supervised Learning

Abstract

In this study, we examine the representation learning abilities of Denoising Diffusion Models (DDM) that were originally purposed for image generation. Our philosophy is to deconstruct a DDM, gradually transforming it into a classical Denoising Autoencoder (DAE). This deconstructive procedure allows us to explore how various components of modern DDMs influence self-supervised representation learning. We observe that only a very few modern components are critical for learning good representations, while many others are nonessential. Our study ultimately arrives at an approach that is highly simplified and to a large extent resembles a classical DAE. We hope our study will rekindle interest in a family of classical methods within the realm of modern self-supervised learning.
