HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

Abstract

Diffusion models have demonstrated remarkable ability in Text-to-Image (T2I) generation. Despite their high-quality outputs, they suffer from heavy computational overhead, especially large models that contain tens of billions of parameters. Prior work has shown that replacing some of the denoising steps with a smaller model still maintains generation quality. However, these methods focus only on saving computation at some timesteps, ignoring the differences in compute demand within a single timestep. In this work, we propose HybridStitch, a new T2I generation paradigm that treats generation like editing. Specifically, we introduce a hybrid stage that jointly uses both the large model and the small model. HybridStitch separates the image into two regions: one that is relatively easy to render, enabling an early transition to the smaller model, and another that is more complex and therefore requires refinement by the large model. HybridStitch employs the small model to construct a coarse sketch while using the large model to edit and refine the complex regions. According to our evaluation, HybridStitch achieves a 1.83× speedup on Stable Diffusion 3, which is faster than all existing mixture-of-models methods.
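The hybrid stage described above can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the complex region is selected or how the stages are scheduled, so the fixed `complex_mask`, the three-phase order (large only, then hybrid, then small only), the element-wise `toy_small` and `toy_large` denoisers, and all function names here are assumptions made purely for illustration.

```python
from typing import Callable, List, Tuple

Latent = List[float]
Denoiser = Callable[[Latent, int], Latent]


def toy_small(latent: Latent, step: int) -> Latent:
    """Hypothetical cheap denoiser: halves every value."""
    return [v * 0.5 for v in latent]


def toy_large(latent: Latent, step: int) -> Latent:
    """Hypothetical expensive denoiser: quarters every value."""
    return [v * 0.25 for v in latent]


def hybrid_step(latent: Latent, small: Denoiser, large: Denoiser,
                complex_mask: List[bool], step: int) -> Latent:
    """One hybrid step: small model everywhere, large model only on masked pixels."""
    out = list(small(latent, step))  # coarse pass over the whole latent
    idx = [i for i, is_complex in enumerate(complex_mask) if is_complex]
    refined = large([latent[i] for i in idx], step)  # costly pass on the complex subset only
    for i, value in zip(idx, refined):
        out[i] = value  # stitch the refined pixels back in
    return out


def hybrid_denoise(latent: Latent, small: Denoiser, large: Denoiser,
                   complex_mask: List[bool], num_steps: int,
                   large_only_steps: int, hybrid_steps: int
                   ) -> Tuple[Latent, List[str]]:
    """Assumed schedule: large only -> hybrid -> small only.

    Returns the final latent and a log of which phase ran at each step.
    """
    phases: List[str] = []
    for step in range(num_steps):
        if step < large_only_steps:
            latent = large(latent, step)
            phases.append("large")
        elif step < large_only_steps + hybrid_steps:
            latent = hybrid_step(latent, small, large, complex_mask, step)
            phases.append("hybrid")
        else:
            latent = small(latent, step)
            phases.append("small")
    return latent, phases
```

In this toy setting the saving comes from the hybrid steps calling the large denoiser on only the masked subset of pixels. A real diffusion backbone is not element-wise, so restricting it to a region would need additional machinery that this sketch does not model.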
