Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis
April 21, 2024
Authors: Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan Xie, Xing Wang, Xuefeng Xiao
cs.AI
Abstract
Recently, a series of diffusion-aware distillation algorithms have emerged to
alleviate the computational overhead associated with the multi-step inference
process of Diffusion Models (DMs). Current distillation techniques often
dichotomize into two distinct aspects: i) ODE Trajectory Preservation; and ii)
ODE Trajectory Reformulation. However, these approaches suffer from severe
performance degradation or domain shifts. To address these limitations, we
propose Hyper-SD, a novel framework that synergistically amalgamates the
advantages of ODE Trajectory Preservation and Reformulation, while maintaining
near-lossless performance during step compression. Firstly, we introduce
Trajectory Segmented Consistency Distillation to progressively perform
consistency distillation within pre-defined time-step segments, which
facilitates the preservation of the original ODE trajectory from a higher-order
perspective. Secondly, we incorporate human feedback learning to boost the
performance of the model in a low-step regime and mitigate the performance loss
incurred by the distillation process. Thirdly, we integrate score distillation
to further improve the low-step generation capability of the model and offer
the first attempt to leverage a unified LoRA to support the inference process
at all steps. Extensive experiments and user studies demonstrate that Hyper-SD
achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5.
For example, Hyper-SDXL surpasses SDXL-Lightning by +0.68 in CLIP Score and
+0.51 in Aes Score in the 1-step inference.
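
The segment idea in Trajectory Segmented Consistency Distillation can be illustrated with a small sketch. The abstract only says that consistency distillation is performed progressively within pre-defined time-step segments; the equal-width segments, the 1000-step training horizon, the choice of the lower segment boundary as the consistency target, and the 8, 4, 2, 1 schedule below are all illustrative assumptions, and the helper names `segment_boundaries` and `segment_target` are hypothetical rather than taken from the authors' code.

```python
def segment_boundaries(num_train_steps: int, num_segments: int) -> list[int]:
    """Return equal-width segment boundaries from 0 to num_train_steps inclusive.

    Equal-width segments are an assumption made for this sketch.
    """
    if num_segments < 1 or num_train_steps % num_segments != 0:
        raise ValueError("num_segments must be positive and divide num_train_steps")
    width = num_train_steps // num_segments
    return [i * width for i in range(num_segments + 1)]


def segment_target(t: int, num_train_steps: int, num_segments: int) -> int:
    """Return the lower boundary of the segment that contains timestep t.

    In this sketch a sample at timestep t is only asked to be consistent
    with the start of its own segment, not with timestep 0, unless there
    is a single segment covering the whole trajectory.
    """
    if not 0 < t <= num_train_steps:
        raise ValueError("t must satisfy 0 < t <= num_train_steps")
    if num_segments < 1 or num_train_steps % num_segments != 0:
        raise ValueError("num_segments must be positive and divide num_train_steps")
    width = num_train_steps // num_segments
    return ((t - 1) // width) * width


if __name__ == "__main__":
    total_steps = 1000  # assumed training horizon
    # Assumed progressive schedule: fewer, wider segments at each stage.
    for k in (8, 4, 2, 1):
        print(k, segment_boundaries(total_steps, k), segment_target(600, total_steps, k))
```

With a single segment the target collapses to timestep 0, which corresponds to ordinary consistency distillation over the whole trajectory; with more segments each target stays close to its source timestep, which is one way to read the abstract's claim about preserving the original ODE trajectory.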