SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions

March 25, 2024
Authors: Yuda Song, Zehao Sun, Xuanwu Yin
cs.AI

Abstract

Recent advancements in diffusion models have positioned them at the forefront of image generation. Despite their superior performance, diffusion models are not without drawbacks: they have complex architectures and substantial computational demands, and their iterative sampling process results in significant latency. To mitigate these limitations, we introduce a dual approach involving model miniaturization and a reduction in sampling steps, aimed at significantly decreasing model latency. Our methodology leverages knowledge distillation to streamline the U-Net and image decoder architectures, and introduces an innovative one-step diffusion model (DM) training technique that utilizes feature matching and score distillation. We present two models, SDXS-512 and SDXS-1024, achieving inference speeds of approximately 100 FPS (30x faster than SD v1.5) and 30 FPS (60x faster than SDXL) on a single GPU, respectively. Moreover, our training approach offers promising applications in image-conditioned control, facilitating efficient image-to-image translation.
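
To put the reported throughput figures in terms of per-image latency, the following is a minimal sketch of the arithmetic. It assumes that FPS is simply the reciprocal of per-image latency and that each speedup factor was measured against the baseline under the same conditions; neither assumption is stated in the abstract, and the function names are illustrative only. Since the abstract gives the speeds as approximate, the derived numbers are rough estimates as well.

```python
def per_image_latency_ms(fps: float) -> float:
    """Convert throughput in frames per second to latency in milliseconds per image."""
    return 1000.0 / fps


def implied_baseline_fps(fps: float, speedup: float) -> float:
    """Throughput the baseline would need for the stated speedup to hold (assumption)."""
    return fps / speedup


# Figures taken from the abstract: (approximate FPS, stated speedup, baseline name).
reported = {
    "SDXS-512": (100.0, 30.0, "SD v1.5"),
    "SDXS-1024": (30.0, 60.0, "SDXL"),
}

for name, (fps, speedup, baseline) in reported.items():
    base_fps = implied_baseline_fps(fps, speedup)
    print(
        f"{name}: ~{per_image_latency_ms(fps):.1f} ms/image; "
        f"implied {baseline}: ~{base_fps:.2f} FPS "
        f"(~{per_image_latency_ms(base_fps):.0f} ms/image)"
    )
```

Under these assumptions, roughly 100 FPS corresponds to about 10 ms per image and roughly 30 FPS to about 33 ms per image, which implies baselines on the order of 300 ms (SD v1.5) and 2 s (SDXL) per image.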