UniFL: Improve Stable Diffusion via Unified Feedback Learning

Abstract

Diffusion models have revolutionized image generation, leading to a proliferation of high-quality models and diverse downstream applications. Despite these advances, current competitive solutions still suffer from several limitations, including inferior visual quality, a lack of aesthetic appeal, and inefficient inference, and no comprehensive solution has yet emerged. To address these challenges, we present UniFL, a unified framework that leverages feedback learning to enhance diffusion models comprehensively. UniFL is universal, effective, and generalizable, and it applies to various diffusion models, such as SD1.5 and SDXL. Notably, UniFL incorporates three key components: perceptual feedback learning, which enhances visual quality; decoupled feedback learning, which improves aesthetic appeal; and adversarial feedback learning, which optimizes inference speed. In-depth experiments and extensive user studies validate the superior performance of our method in improving both generation quality and inference acceleration. For instance, UniFL surpasses ImageReward by 17% in user preference for generation quality, and it outperforms LCM and SDXL Turbo by 57% and 20%, respectively, in 4-step inference. Moreover, we have verified the efficacy of our approach in downstream tasks, including LoRA, ControlNet, and AnimateDiff.
