SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher

August 26, 2024
Authors: Trung Dao, Thuan Hoang Nguyen, Thanh Le, Duc Vu, Khoi Nguyen, Cuong Pham, Anh Tran
cs.AI

Abstract

In this paper, we aim to enhance the performance of SwiftBrush, a prominent one-step text-to-image diffusion model, to be competitive with its multi-step Stable Diffusion counterpart. We first explore the quality-diversity trade-off between SwiftBrush and SD Turbo: the former excels in image diversity, while the latter excels in image quality. This observation motivates our proposed modifications to the training methodology, including better weight initialization and efficient LoRA training. Moreover, we introduce a novel clamped CLIP loss that enhances image-text alignment and improves image quality. Remarkably, by combining the weights of models trained with efficient LoRA and with full training, we obtain a new state-of-the-art one-step diffusion model that achieves an FID of 8.14, surpassing all GAN-based and multi-step Stable Diffusion models. The evaluation code is available at: https://github.com/vinairesearch/swiftbrushv2.
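
The abstract names two mechanisms without giving formulas: a clamped CLIP loss and a combination of the weights of two trained models. The sketch below illustrates one plausible reading of each, and it is not taken from the authors' code. It assumes the clamp is a hinge on cosine similarity (the loss falls to zero once image-text similarity reaches a threshold) and that the weights are combined by linear interpolation. The names `clamped_clip_loss`, `merge_weights`, `tau`, and `lam`, and their default values, are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def clamped_clip_loss(image_emb, text_emb, tau=0.3):
    """Assumed form of a clamped CLIP loss: max(0, tau - cos(image, text)).

    The loss is positive only while the similarity is below tau, so it
    stops pushing once alignment is "good enough". The default tau is an
    arbitrary placeholder, not a value from the paper.
    """
    return max(0.0, tau - cosine_similarity(image_emb, text_emb))


def merge_weights(weights_a, weights_b, lam=0.5):
    """Assumed merge rule: per-parameter interpolation lam*A + (1-lam)*B.

    Both arguments map parameter names to flat lists of floats. A LoRA
    checkpoint would first need its low-rank update folded into dense
    weights so that both models share the same parameter names.
    """
    if weights_a.keys() != weights_b.keys():
        raise ValueError("checkpoints must share the same parameter names")
    return {
        name: [lam * a + (1.0 - lam) * b
               for a, b in zip(weights_a[name], weights_b[name])]
        for name in weights_a
    }


if __name__ == "__main__":
    # Orthogonal embeddings have similarity 0, so the loss equals tau.
    print(clamped_clip_loss([1.0, 0.0], [0.0, 1.0], tau=0.3))
    # Identical embeddings have similarity 1, so the loss is clamped to 0.
    print(clamped_clip_loss([1.0, 0.0], [1.0, 0.0], tau=0.3))
    print(merge_weights({"w": [0.0, 2.0]}, {"w": [2.0, 4.0]}, lam=0.5))
```

In a real training setup the embeddings would come from a CLIP image and text encoder and the parameters would be tensors rather than Python lists; plain lists are used here only to keep the sketch self-contained.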
