Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow

October 9, 2024
Authors: Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
cs.AI

Abstract

Diffusion models have greatly improved visual generation but are hindered by slow generation speed due to the computationally intensive nature of solving generative ODEs. Rectified flow, a widely recognized solution, improves generation speed by straightening the ODE path. Its key components include: 1) using the diffusion form of flow-matching, 2) employing v-prediction, and 3) performing rectification (a.k.a. reflow). In this paper, we argue that the success of rectification primarily lies in using a pretrained diffusion model to obtain matched pairs of noise and samples, followed by retraining with these matched noise-sample pairs. Based on this, components 1) and 2) are unnecessary. Furthermore, we highlight that straightness is not an essential training target for rectification; rather, it is a specific case of flow-matching models. The more critical training target is to achieve a first-order approximate ODE path, which is inherently curved for models like DDPM and Sub-VP. Building on this insight, we propose Rectified Diffusion, which generalizes the design space and application scope of rectification to encompass the broader category of diffusion models, rather than being restricted to flow-matching models. We validate our method on Stable Diffusion v1-5 and Stable Diffusion XL. Our method not only greatly simplifies the training procedure of previous rectified flow-based works (e.g., InstaFlow) but also achieves superior performance with even lower training cost. Our code is available at https://github.com/G-U-N/Rectified-Diffusion.
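
The core recipe described in the abstract (draw noise, obtain the matching sample from a pretrained model, then retrain on those fixed pairs) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the cosine variance-preserving schedule, the toy stand-in for the pretrained sampler, and every function name below are assumptions chosen only to make the idea concrete. The sketch uses epsilon-prediction and a non-linear interpolation to reflect the claim that neither the flow-matching form nor v-prediction is required.

```python
import numpy as np


def vp_schedule(t):
    """Hypothetical variance-preserving schedule with alpha^2 + sigma^2 = 1."""
    alpha = np.cos(0.5 * np.pi * t)
    sigma = np.sin(0.5 * np.pi * t)
    return alpha, sigma


def toy_pretrained_sampler(noise):
    """Stand-in for a pretrained model's deterministic ODE sampler (assumption).

    A real pipeline would run the pretrained diffusion model here; this toy
    map only guarantees that each noise vector has one fixed matching sample.
    """
    return 2.0 * np.tanh(noise)


def make_matched_pairs(num_pairs, dim, rng):
    """Draw noise and record the sample the 'pretrained' model maps it to."""
    noise = rng.standard_normal((num_pairs, dim))
    samples = toy_pretrained_sampler(noise)
    return noise, samples


def rectification_batch(noise, samples, rng):
    """Build retraining inputs from matched pairs using the diffusion form.

    x_t = alpha_t * x_0 + sigma_t * eps uses the SAME eps that produced x_0,
    instead of independently drawn noise. The target is eps itself
    (epsilon-prediction). The path traced by x_t is curved in t because
    alpha_t and sigma_t are not linear.
    """
    t = rng.uniform(0.0, 1.0, size=(noise.shape[0], 1))
    alpha, sigma = vp_schedule(t)
    x_t = alpha * samples + sigma * noise
    return x_t, t, noise


def epsilon_loss(predicted_noise, target_noise):
    """Mean squared error between predicted and matched noise."""
    return float(np.mean((predicted_noise - target_noise) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise, samples = make_matched_pairs(num_pairs=8, dim=4, rng=rng)
    x_t, t, target = rectification_batch(noise, samples, rng)
    # A model that recovers the matched noise exactly has zero loss.
    print(epsilon_loss(target, noise))
```

In this toy setup, a network that predicts the same matched noise at every point along a pair's path would let a single solver step recover the sample, which is one way to read the "first-order" property the abstract refers to.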
