DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models

Abstract

Diffusion models have achieved great success in synthesizing high-quality images. However, generating high-resolution images with diffusion models remains challenging because of the enormous computational cost, which results in prohibitive latency for interactive applications. In this paper, we propose DistriFusion to tackle this problem by leveraging parallelism across multiple GPUs. Our method splits the model input into multiple patches and assigns each patch to a GPU. However, naïvely implementing such an algorithm breaks the interaction between patches and loses fidelity, while incorporating that interaction incurs tremendous communication overhead. To overcome this dilemma, we observe the high similarity between the inputs of adjacent diffusion steps and propose displaced patch parallelism, which takes advantage of the sequential nature of the diffusion process by reusing the pre-computed feature maps from the previous timestep to provide context for the current step. Our method therefore supports asynchronous communication, which can be pipelined with computation. Extensive experiments show that our method can be applied to the recent Stable Diffusion XL with no quality degradation and achieves up to a 6.1× speedup on eight NVIDIA A100s compared to a single GPU. Our code is publicly available at https://github.com/mit-han-lab/distrifuser.
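The trade-off described in the abstract can be illustrated with a small toy sketch. It is not the DistriFusion implementation: it assumes a 1-D signal in place of an image, a 3-tap mean filter in place of a network layer, and a loop over patches in place of real GPUs. All function names (`blur`, `naive_patch_step`, `displaced_patch_step`, `simulate`) are hypothetical. The sketch also assumes that the first step uses fresh context, since no previous step exists yet. It compares a patch-isolated step against a step that borrows neighbor context from the previous timestep, using the single-device result as the reference.

```python
from typing import List, Optional, Tuple


def blur(x: List[float], left: float, right: float) -> List[float]:
    """3-tap mean filter; `left`/`right` are the context values just outside x."""
    padded = [left] + list(x) + [right]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3
            for i in range(1, len(padded) - 1)]


def full_step(x: List[float]) -> List[float]:
    """Reference: one device sees the whole input (edges are replicate-padded)."""
    return blur(x, x[0], x[-1])


def naive_patch_step(x: List[float], n: int) -> List[float]:
    """Each patch is processed in isolation, so cross-patch context is lost."""
    size = len(x) // n
    out: List[float] = []
    for i in range(n):
        p = x[i * size:(i + 1) * size]
        out += blur(p, p[0], p[-1])
    return out


def displaced_patch_step(x: List[float], stale: List[float], n: int) -> List[float]:
    """Each patch uses fresh values for its own region and stale values
    (from the previous timestep) for the neighboring context."""
    size = len(x) // n
    out: List[float] = []
    for i in range(n):
        lo, hi = i * size, (i + 1) * size
        p = x[lo:hi]
        left = stale[lo - 1] if lo > 0 else p[0]
        right = stale[hi] if hi < len(x) else p[-1]
        out += blur(p, left, right)
    return out


def max_abs_diff(a: List[float], b: List[float]) -> float:
    return max(abs(u - v) for u, v in zip(a, b))


def simulate(num_steps: int = 5, n_devices: int = 4) -> Tuple[float, float]:
    """Return the worst-case error of the naive and displaced variants."""
    base = [float(i * i) for i in range(16)]
    stale: Optional[List[float]] = None
    worst_naive = worst_displaced = 0.0
    for t in range(num_steps):
        # Adjacent steps differ only slightly, mimicking adjacent diffusion steps.
        scale = 1.0 - 0.02 * t
        x = [scale * v for v in base]
        if stale is None:
            stale = x  # assumption: the first step has access to fresh context
        ref = full_step(x)
        worst_naive = max(worst_naive, max_abs_diff(ref, naive_patch_step(x, n_devices)))
        worst_displaced = max(
            worst_displaced, max_abs_diff(ref, displaced_patch_step(x, stale, n_devices)))
        stale = x  # becomes the "previous timestep" for the next iteration
    return worst_naive, worst_displaced


if __name__ == "__main__":
    naive_err, displaced_err = simulate()
    print(f"naive patch error:     {naive_err:.3f}")
    print(f"displaced patch error: {displaced_err:.3f}")
```

In this toy setting the displaced variant stays much closer to the single-device reference than the isolated variant, because the stale context differs from the fresh context only by the small step-to-step change. Because the stale values are already available when a step begins, a real system could exchange them in the background rather than blocking on them, which is the property the abstract refers to as asynchronous communication.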
