

Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents

October 1, 2025
Authors: Beomsu Kim, Byunghee Cha, Jong Chul Ye
cs.AI

Abstract

With diffusion and flow matching models achieving state-of-the-art generation performance, the community's interest has now turned to reducing inference time without sacrificing sample quality. Consistency Models (CMs), which are trained to be consistent along diffusion or probability flow ordinary differential equation (PF-ODE) trajectories, enable one- or two-step flow or diffusion sampling. However, CMs typically require prolonged training with large batch sizes to obtain competitive sample quality. In this paper, we examine the training dynamics of CMs near convergence and discover that CM tangents -- the update directions of CM outputs -- are quite oscillatory, in the sense that they move parallel to the data manifold rather than towards it. To mitigate oscillatory tangents, we propose a new loss function, called the Manifold Feature Distance (MFD), which provides manifold-aligned tangents that point toward the data manifold. Consequently, our method -- dubbed Align Your Tangent (AYT) -- accelerates CM training by orders of magnitude and even outperforms the learned perceptual image patch similarity metric (LPIPS). Furthermore, we find that our loss enables training with extremely small batch sizes without compromising sample quality. Code: https://github.com/1202kbs/AYT
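Since the abstract contrasts MFD with LPIPS, the core idea of measuring the consistency target mismatch in a learned feature space rather than in pixel space can be illustrated with a minimal sketch. The encoder, the loss form, and the training step below are illustrative assumptions, not the authors' implementation; see https://github.com/1202kbs/AYT for the official code.

```python
# Minimal sketch: a feature-space distance used as a consistency-training loss,
# in the spirit of the Manifold Feature Distance (MFD) described in the abstract.
# The feature extractor and the squared-error form are assumptions for illustration.
import torch
import torch.nn as nn


class FeatureDistanceLoss(nn.Module):
    """Distance between two images measured in the feature space of a
    frozen (hypothetical) encoder, rather than directly in pixel space."""

    def __init__(self, feature_extractor: nn.Module):
        super().__init__()
        self.features = feature_extractor.eval()
        for p in self.features.parameters():
            p.requires_grad_(False)  # keep the encoder frozen

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        fx, fy = self.features(x), self.features(y)
        return (fx - fy).pow(2).mean()


# Toy usage: a consistency-style step that pulls the student's one-step output
# toward a stop-gradient teacher target, comparing them in feature space.
if __name__ == "__main__":
    encoder = nn.Sequential(  # stand-in encoder, not the one used in the paper
        nn.Conv2d(3, 16, 3, padding=1), nn.SiLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )
    loss_fn = FeatureDistanceLoss(encoder)

    student_out = torch.randn(4, 3, 32, 32, requires_grad=True)
    teacher_out = torch.randn(4, 3, 32, 32)  # e.g. EMA / stop-grad target

    loss = loss_fn(student_out, teacher_out.detach())
    loss.backward()
    print(float(loss))
```

Under this reading, the gradient of the loss with respect to the student output (the "tangent") is shaped by the encoder's feature geometry, which is how a feature-space loss can point updates toward the data manifold instead of sliding along it.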