SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

August 26, 2024
Authors: Trung Dao, Thuan Hoang Nguyen, Thanh Le, Duc Vu, Khoi Nguyen, Cuong Pham, Anh Tran
cs.AI

Abstract

In this paper, we aim to enhance the performance of SwiftBrush, a prominent one-step text-to-image diffusion model, so that it is competitive with its multi-step Stable Diffusion counterpart. We first explore the quality-diversity trade-off between SwiftBrush and SD Turbo: the former excels in image diversity, while the latter excels in image quality. This observation motivates our modifications to the training methodology, including better weight initialization and efficient LoRA training. Moreover, we introduce a novel clamped CLIP loss that enhances image-text alignment and improves image quality. Remarkably, by combining the weights of models trained with efficient LoRA and with full training, we obtain a new state-of-the-art one-step diffusion model that achieves an FID of 8.14 and surpasses all GAN-based and multi-step Stable Diffusion models. The evaluation code is available at: https://github.com/vinairesearch/swiftbrushv2.
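
The abstract names a clamped CLIP loss and a combination of model weights but gives no formulas for either. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the clamped loss is a hinge on the cosine similarity between image and text embeddings, and that the weights are combined by linear interpolation. The function names and the parameters `tau` and `lam` are hypothetical, and plain Python lists stand in for real embeddings and model tensors.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clamped_clip_loss(image_emb, text_emb, tau):
    """Hinge-style loss (assumed form): zero once similarity reaches tau.

    Clamping at zero means the loss stops pushing on samples that are
    already aligned well enough. tau is a hypothetical threshold.
    """
    return max(0.0, tau - cosine_similarity(image_emb, text_emb))


def merge_weights(weights_a, weights_b, lam):
    """Linear interpolation (assumed form) of two weight dictionaries.

    Both dictionaries must share the same parameter names and shapes;
    lam is a hypothetical mixing coefficient in [0, 1].
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("both models must have the same parameter names")
    return {
        name: [lam * x + (1.0 - lam) * y
               for x, y in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }
```

With `tau=0.5`, identical embeddings give a loss of zero while orthogonal embeddings give a loss of `0.5`, which illustrates why the clamp leaves already well-aligned samples untouched.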
