Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
October 9, 2024
Authors: Fu-Yun Wang, Ling Yang, Zhaoyang Huang, Mengdi Wang, Hongsheng Li
cs.AI
Abstract
Diffusion models have greatly improved visual generation but are hindered by
slow generation speed due to the computationally intensive nature of solving
generative ODEs. Rectified flow, a widely recognized solution, improves
generation speed by straightening the ODE path. Its key components include: 1)
using the diffusion form of flow-matching, 2) employing v-prediction, and
3) performing rectification (a.k.a. reflow). In this paper,
we argue that the success of rectification primarily lies in using a pretrained
diffusion model to obtain matched pairs of noise and samples, followed by
retraining with these matched noise-sample pairs. Based on this, components 1)
and 2) are unnecessary. Furthermore, we highlight that straightness is not an
essential training target for rectification; rather, it is a specific case of
flow-matching models. The more critical training target is to achieve a
first-order approximate ODE path, which is inherently curved for models like
DDPM and Sub-VP. Building on this insight, we propose Rectified Diffusion,
which generalizes the design space and application scope of rectification to
encompass the broader category of diffusion models, rather than being
restricted to flow-matching models. We validate our method on Stable Diffusion
v1-5 and Stable Diffusion XL. Our method not only greatly simplifies the
training procedure of previous rectified flow-based works (e.g., InstaFlow) but
also achieves superior performance at even lower training cost. Our code is
available at https://github.com/G-U-N/Rectified-Diffusion.
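
The claim that a first-order path need not be straight can be illustrated with a small sketch. It assumes the common diffusion interpolation x_t = alpha_t * x0 + sigma_t * eps, which the abstract does not spell out, and it uses two illustrative schedules: a linear one (flow-matching style) and a cosine variance-preserving one standing in for DDPM-like models. All function names are hypothetical and are not taken from the authors' repository; the fixed (x0, eps) pair stands in for a matched noise-sample pair obtained from a pretrained model.

```python
import math

def flow_matching_schedule(t):
    """Linear schedule: alpha_t = 1 - t, sigma_t = t."""
    return 1.0 - t, t

def vp_schedule(t):
    """Illustrative variance-preserving schedule: alpha_t^2 + sigma_t^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def first_order_path(x0, eps, schedule, ts):
    """Points x_t = alpha_t * x0 + sigma_t * eps for one fixed (x0, eps) pair."""
    path = []
    for t in ts:
        alpha, sigma = schedule(t)
        path.append([alpha * a + sigma * e for a, e in zip(x0, eps)])
    return path

def max_deviation_from_chord(path):
    """Largest distance of a 2-D path point from the line through its endpoints."""
    (x_a, y_a), (x_b, y_b) = path[0], path[-1]
    dx, dy = x_b - x_a, y_b - y_a
    length = math.hypot(dx, dy)
    return max(abs(dx * (y - y_a) - dy * (x - x_a)) / length for x, y in path)

def one_step_x0(x_t, eps, schedule, t):
    """Recover x0 from x_t in one step, given that eps is constant along the path."""
    alpha, sigma = schedule(t)
    return [(v - sigma * e) / alpha for v, e in zip(x_t, eps)]

# A fixed (sample, noise) pair; in practice it would come from a pretrained model.
x0 = [1.0, 0.0]
eps = [0.0, 1.0]
ts = [i / 10 for i in range(11)]

fm_path = first_order_path(x0, eps, flow_matching_schedule, ts)
vp_path = first_order_path(x0, eps, vp_schedule, ts)

fm_dev = max_deviation_from_chord(fm_path)  # straight path: deviation is ~0
vp_dev = max_deviation_from_chord(vp_path)  # curved path: deviation is clearly nonzero

# Despite the curvature, x0 is recovered exactly from an intermediate point.
recovered = one_step_x0(vp_path[8], eps, vp_schedule, ts[8])

print("flow-matching deviation:", fm_dev)
print("variance-preserving deviation:", vp_dev)
print("x0 recovered from the curved path:", recovered)
```

In this toy setting the linear schedule traces a straight segment while the variance-preserving schedule traces an arc, yet both paths permit exact one-step recovery of x0 because the same noise is shared by every point on the path. That shared noise, rather than straightness, is the property the sketch is meant to highlight.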