Rolling Diffusion Models

Abstract

Diffusion models are increasingly applied to temporal data such as video, fluid mechanics simulations, and climate data. These methods typically apply the same amount of noise to every frame in the diffusion process. This paper explores Rolling Diffusion, a new approach that uses a sliding-window denoising process. It assigns more noise to frames that appear later in a sequence, so that corruption increases progressively through time, reflecting greater uncertainty about the future as the generation process unfolds. Empirically, we show that Rolling Diffusion is superior to standard diffusion when the temporal dynamics are complex. In particular, this result is demonstrated on a video prediction task using the Kinetics-600 video dataset and in a chaotic fluid dynamics forecasting experiment.
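The idea of assigning more noise to later frames within a sliding window can be illustrated with a small sketch. The abstract does not give the schedule, so the linear form below, the function name `frame_noise_levels`, and the parameters `t` and `window` are assumptions made for illustration, not the paper's definition.

```python
def frame_noise_levels(t, window):
    """Return one noise level in [0, 1] per frame of a sliding window.

    Assumed (hypothetical) linear schedule: frame k of a window of
    `window` frames gets level (k + t) / window, where t in [0, 1] is
    the global diffusion time. Later frames always receive more noise.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    return [min(1.0, max(0.0, (k + t) / window)) for k in range(window)]


# Start of a denoising pass (t = 1): the last frame is pure noise.
start = frame_noise_levels(1.0, 4)

# End of the pass (t = 0): the first frame is fully denoised.
end = frame_noise_levels(0.0, 4)

# After the pass, the clean first frame can be emitted and the window
# shifted by one: frame k+1 at t = 0 has the level frame k had at t = 1.
shift_is_consistent = all(end[k + 1] == start[k] for k in range(3))
```

Under this assumed schedule, shifting the window by one frame after each pass leaves every remaining frame at exactly the noise level the next pass expects, which is what lets the window roll forward indefinitely.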
