

Align Your Tangent: Training Better Consistency Models via Manifold-Aligned Tangents

October 1, 2025
Authors: Beomsu Kim, Byunghee Cha, Jong Chul Ye
cs.AI

Abstract

With diffusion and flow matching models achieving state-of-the-art generation performance, the community's interest has now turned to reducing inference time without sacrificing sample quality. Consistency Models (CMs), which are trained to be consistent along diffusion or probability flow ordinary differential equation (PF-ODE) trajectories, enable one- or two-step flow or diffusion sampling. However, CMs typically require prolonged training with large batch sizes to obtain competitive sample quality. In this paper, we examine the training dynamics of CMs near convergence and discover that CM tangents -- CM output update directions -- are quite oscillatory, in the sense that they move parallel to the data manifold rather than toward it. To mitigate oscillatory tangents, we propose a new loss function, called the manifold feature distance (MFD), which provides manifold-aligned tangents that point toward the data manifold. Consequently, our method -- dubbed Align Your Tangent (AYT) -- can accelerate CM training by orders of magnitude and even outperform the learned perceptual image patch similarity metric (LPIPS). Furthermore, we find that our loss enables training with extremely small batch sizes without compromising sample quality. Code: https://github.com/1202kbs/AYT
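To make the core idea concrete, below is a minimal, hypothetical sketch of a consistency-training step in which the usual pixel-space distance between student and teacher outputs is replaced by a distance computed in the feature space of a pretrained encoder `phi` (an LPIPS-style substitution intended to illustrate the notion of manifold-aligned tangents). The names `student`, `teacher`, `phi`, and the variance-exploding noise schedule are assumptions for illustration, not the authors' implementation of MFD.

```python
import torch

def feature_distance(phi, x, y):
    # Distance measured in the encoder's feature space rather than pixel space.
    return (phi(x) - phi(y)).pow(2).flatten(1).mean(dim=1)

def consistency_loss(student, teacher, phi, x0, t, s, noise):
    # Two points on the same diffusion / PF-ODE trajectory (variance-exploding style), with s < t.
    x_t = x0 + t.view(-1, 1, 1, 1) * noise
    x_s = x0 + s.view(-1, 1, 1, 1) * noise
    pred_t = student(x_t, t)            # student maps x_t back toward the data manifold
    with torch.no_grad():
        pred_s = teacher(x_s, s)        # teacher (e.g. an EMA copy) on the earlier point
    # A feature-space loss is intended to yield update directions (tangents) that point
    # toward the data manifold, instead of the oscillatory tangents of a raw pixel-space loss.
    return feature_distance(phi, pred_t, pred_s).mean()
```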