Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis

April 21, 2024
Authors: Yuxi Ren, Xin Xia, Yanzuo Lu, Jiacheng Zhang, Jie Wu, Pan Xie, Xing Wang, Xuefeng Xiao
cs.AI

Abstract

Recently, a series of diffusion-aware distillation algorithms have emerged to reduce the computational overhead of the multi-step inference process of Diffusion Models (DMs). Current distillation techniques generally fall into two categories: i) ODE Trajectory Preservation; and ii) ODE Trajectory Reformulation. However, these approaches suffer from severe performance degradation or domain shifts. To address these limitations, we propose Hyper-SD, a novel framework that combines the advantages of ODE Trajectory Preservation and Reformulation while maintaining near-lossless performance during step compression. First, we introduce Trajectory Segmented Consistency Distillation, which progressively performs consistency distillation within pre-defined time-step segments and thereby helps preserve the original ODE trajectory from a higher-order perspective. Second, we incorporate human feedback learning to boost the model's performance in the low-step regime and mitigate the performance loss incurred by distillation. Third, we integrate score distillation to further improve the model's low-step generation capability, and we make the first attempt to use a unified LoRA to support inference at all step counts. Extensive experiments and user studies demonstrate that Hyper-SD achieves SOTA performance from 1 to 8 inference steps for both SDXL and SD1.5. For example, Hyper-SDXL surpasses SDXL-Lightning by +0.68 in CLIP Score and +0.51 in Aes Score in 1-step inference.
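
The abstract does not spell out how the time-step segments are defined, so the following is only a minimal sketch of the segmentation idea, not the authors' implementation. It assumes equal-width segments over a discrete range of `total_steps` time steps, and it assumes that within a segment the consistency target is the segment's lower boundary. The function names, the value 1000, and the halving schedule (8, 4, 2, 1) are illustrative assumptions.

```python
def segment_boundaries(total_steps: int, num_segments: int) -> list[int]:
    """Return equal-width boundaries 0 = s_0 < s_1 < ... < s_k = total_steps.

    Assumption: segments have equal width; the abstract only says they are
    "pre-defined".
    """
    if num_segments <= 0 or total_steps % num_segments != 0:
        raise ValueError("total_steps must be a positive multiple of num_segments")
    width = total_steps // num_segments
    return [i * width for i in range(num_segments + 1)]


def segment_start(t: int, total_steps: int, num_segments: int) -> int:
    """Return the lower boundary of the segment (s_i, s_{i+1}] that contains t.

    Assumption: a model distilled within a segment is asked to be consistent
    with respect to this boundary rather than to time step 0.
    """
    if not 0 < t <= total_steps:
        raise ValueError("t must satisfy 0 < t <= total_steps")
    width = total_steps // num_segments
    return ((t - 1) // width) * width


if __name__ == "__main__":
    # Hypothetical progressive schedule: fewer, wider segments at each stage.
    for k in (8, 4, 2, 1):
        print(k, segment_boundaries(1000, k), segment_start(600, 1000, k))
```

With a single segment the target collapses to time step 0, which corresponds to ordinary consistency distillation over the whole trajectory; the segmented stages before it restrict each consistency constraint to a shorter stretch of the ODE trajectory.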