

LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

November 9, 2023
Authors: Simian Luo, Yiqin Tan, Suraj Patil, Daniel Gu, Patrick von Platen, Apolinário Passos, Longbo Huang, Jian Li, Hang Zhao
cs.AI

Abstract

Latent Consistency Models (LCMs) have achieved impressive performance in accelerating text-to-image generative tasks, producing high-quality images in a minimal number of inference steps. LCMs are distilled from pre-trained latent diffusion models (LDMs), requiring only ~32 A100 GPU training hours. This report extends LCMs' potential in two aspects. First, by applying LoRA distillation to Stable-Diffusion models including SD-V1.5, SSD-1B, and SDXL, we expand LCM's scope to larger models with significantly less memory consumption, achieving superior image generation quality. Second, we identify the LoRA parameters obtained through LCM distillation as a universal Stable-Diffusion acceleration module, named LCM-LoRA. LCM-LoRA can be directly plugged into various Stable-Diffusion fine-tuned models or LoRAs without training, thus representing a universally applicable accelerator for diverse image generation tasks. Compared with previous numerical PF-ODE solvers such as DDIM and DPM-Solver, LCM-LoRA can be viewed as a plug-in neural PF-ODE solver that possesses strong generalization abilities. Project page: https://github.com/luosiallen/latent-consistency-model.
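
Because LCM-LoRA ships as ordinary LoRA weights, applying it amounts to loading those weights into an existing pipeline and switching to an LCM sampler. The following is a minimal sketch using the Hugging Face diffusers library (it assumes a diffusers version that provides LCMScheduler; the base-model and LoRA repository ids, prompt, and step/guidance settings are illustrative, not prescribed by the report):

    # Minimal sketch: plug LCM-LoRA into an SDXL pipeline (assumes
    # diffusers >= 0.23 with LCMScheduler, torch, and a CUDA device).
    import torch
    from diffusers import DiffusionPipeline, LCMScheduler

    pipe = DiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # any SDXL fine-tune works the same way
        torch_dtype=torch.float16,
    ).to("cuda")

    # Swap in the LCM scheduler and attach the distilled LCM-LoRA
    # weights; no additional training is needed.
    pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
    pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

    # With LCM-LoRA, very few steps and low guidance suffice.
    image = pipe(
        "a photo of an astronaut riding a horse",
        num_inference_steps=4,
        guidance_scale=1.0,
    ).images[0]
    image.save("astronaut.png")

The same pattern applies to SD-V1.5 or SSD-1B pipelines with the corresponding LCM-LoRA weights, which is what makes the module a drop-in accelerator rather than a per-model distillation.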