Rethinking the Diffusion Model from a Langevin Perspective

Abstract

Diffusion models are often introduced from several perspectives, such as VAEs, score matching, or flow matching, and come with dense, technically demanding mathematics that beginners can find hard to grasp. One classic question is: how does the reverse process invert the forward process to generate data from pure noise? This article systematically organizes the diffusion model from a fresh Langevin perspective, offering a simpler, clearer, and more intuitive answer. We also address the following questions: How can ODE-based and SDE-based diffusion models be unified under a single framework? Why are diffusion models theoretically superior to ordinary VAEs? Why is flow matching not fundamentally simpler than denoising or score matching, yet equivalent to them under maximum likelihood? We show that the Langevin perspective offers clear and straightforward answers to these questions. It bridges existing interpretations of diffusion models, shows how different formulations can be converted into one another within a common framework, and offers pedagogical value for both learners and experienced researchers seeking deeper intuition.
