Cache Me if You Can: Accelerating Diffusion Models through Block Caching

Abstract

Diffusion models have recently revolutionized the field of image synthesis due to their ability to generate photorealistic images. However, one of their major drawbacks is that image generation is costly: a large image-to-image network has to be applied many times to iteratively refine an image from random noise. While many recent works propose techniques to reduce the number of required steps, they generally treat the underlying denoising network as a black box. In this work, we investigate the behavior of the layers within the network and find that 1) the layers' outputs change smoothly over time, 2) the layers show distinct patterns of change, and 3) the change from step to step is often very small. We hypothesize that many layer computations in the denoising network are redundant. Leveraging this, we introduce block caching, in which we reuse outputs from layer blocks of previous steps to speed up inference. Furthermore, we propose a technique to automatically determine caching schedules based on each block's changes over timesteps. In our experiments, we show through FID, human evaluation, and qualitative analysis that block caching allows generating images with higher visual quality at the same computational cost. We demonstrate this for different state-of-the-art models (LDM and EMU) and solvers (DDIM and DPM).
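The two ideas in the abstract, reusing block outputs across steps and deriving a caching schedule from how much each block changes, can be illustrated with a minimal toy sketch. Everything in it is an assumption made for illustration and not the paper's implementation: the blocks are hypothetical scalar functions rather than U-Net layers, the network is a plain chain of residual blocks, the state update stands in for a DDIM or DPM solver step, and the schedule rule (recompute a block once its accumulated relative change exceeds a threshold) is one plausible reading of "based on each block's changes over timesteps".

```python
from typing import Callable, List, Optional, Sequence, Tuple

Vector = List[float]
Block = Callable[[Vector, int], Vector]  # (hidden state, step index) -> residual


def relative_change(new: Vector, old: Vector) -> float:
    """Relative L1 change between a block's outputs at two consecutive steps."""
    num = sum(abs(a - b) for a, b in zip(new, old))
    den = sum(abs(a) for a in new)
    return num / den if den > 0 else 0.0


def run_blocks(
    blocks: Sequence[Block],
    x: Vector,
    steps: int,
    schedule: Optional[Sequence[Sequence[bool]]] = None,
) -> Tuple[Vector, List[List[Vector]], int]:
    """Run a toy residual network for `steps` steps, optionally with caching.

    schedule[t][i] is True when block i must be recomputed at step t; otherwise
    the residual from its most recent computed step is reused.
    Returns the final state, the residuals used at every step, and the number
    of block evaluations actually performed.
    """
    cache: List[Optional[Vector]] = [None] * len(blocks)
    history: List[List[Vector]] = []
    evals = 0
    for t in range(steps):
        h = list(x)
        used: List[Vector] = []
        for i, block in enumerate(blocks):
            cached = cache[i]
            if cached is None or schedule is None or schedule[t][i]:
                cached = block(h, t)  # recompute and refresh the cache
                cache[i] = cached
                evals += 1
            h = [a + r for a, r in zip(h, cached)]  # residual connection
            used.append(cached)
        history.append(used)
        x = h  # toy update standing in for a solver step (assumption)
    return x, history, evals


def derive_schedule(history: List[List[Vector]], threshold: float) -> List[List[bool]]:
    """Assumed rule: recompute a block when its accumulated change exceeds `threshold`."""
    n_blocks = len(history[0])
    schedule = [[True] * n_blocks]  # every block is computed at the first step
    accumulated = [0.0] * n_blocks
    for t in range(1, len(history)):
        row = []
        for i in range(n_blocks):
            accumulated[i] += relative_change(history[t][i], history[t - 1][i])
            recompute = accumulated[i] > threshold
            if recompute:
                accumulated[i] = 0.0
            row.append(recompute)
        schedule.append(row)
    return schedule


def make_toy_block(gain: float, drift: float) -> Block:
    """Hypothetical block whose residual drifts with the step index t."""
    def block(h: Vector, t: int) -> Vector:
        return [gain * (1.0 + drift * t) * v for v in h]
    return block


blocks = [make_toy_block(-0.05, 0.0), make_toy_block(-0.05, 0.2)]
x0 = [1.0, -2.0, 0.5]
_, history, full_cost = run_blocks(blocks, x0, steps=10)  # profiling pass, no caching
schedule = derive_schedule(history, threshold=0.5)
x_cached, _, cached_cost = run_blocks(blocks, x0, steps=10, schedule=schedule)
print(f"block evaluations: {full_cost} without caching, {cached_cost} with caching")
```

The sketch only counts block evaluations; it says nothing about image quality, which the abstract evaluates with FID, human evaluation, and qualitative analysis. A lower threshold recomputes blocks more often and stays closer to the uncached result, while a higher one saves more computation.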
