

Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models

October 2, 2025
作者: Runqian Wang, Yilun Du
cs.AI

Abstract

We introduce Equilibrium Matching (EqM), a generative modeling framework built from an equilibrium dynamics perspective. EqM discards the non-equilibrium, time-conditional dynamics in traditional diffusion and flow-based generative models and instead learns the equilibrium gradient of an implicit energy landscape. Through this approach, we can adopt an optimization-based sampling process at inference time, where samples are obtained by gradient descent on the learned landscape with adjustable step sizes, adaptive optimizers, and adaptive compute. EqM surpasses the generation performance of diffusion/flow models empirically, achieving an FID of 1.90 on ImageNet 256×256. EqM is also theoretically justified to learn and sample from the data manifold. Beyond generation, EqM is a flexible framework that naturally handles tasks including partially noised image denoising, OOD detection, and image composition. By replacing time-conditional velocities with a unified equilibrium landscape, EqM offers a tighter bridge between flow and energy-based models and a simple route to optimization-driven inference.
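The optimization-based sampling the abstract describes can be sketched as plain gradient descent on the learned landscape. The following is a minimal illustration, not the paper's implementation: `grad_fn` stands in for the learned equilibrium gradient (here replaced by the gradient of a toy quadratic energy), and the function name `eqm_sample` and its parameters are hypothetical.

```python
import numpy as np

def eqm_sample(grad_fn, x_init, step_size=0.1, n_steps=100):
    """Sketch of optimization-based sampling: descend the learned
    equilibrium gradient field with an adjustable step size.
    `grad_fn` is a stand-in for the learned gradient model."""
    x = np.asarray(x_init, dtype=float)
    for _ in range(n_steps):
        x = x - step_size * grad_fn(x)  # plain gradient-descent update
    return x

# Toy landscape: energy E(x) = 0.5 * ||x - mu||^2 has gradient (x - mu),
# so descent converges to mu, playing the role of a data mode.
mu = np.array([1.0, -2.0])
toy_grad = lambda x: x - mu
sample = eqm_sample(toy_grad, x_init=np.array([5.0, 5.0]),
                    step_size=0.2, n_steps=200)
```

Because the sampler is just an optimizer loop, the abstract's mention of adaptive optimizers and adaptive compute corresponds to swapping the update rule (e.g. momentum or Adam-style steps) or stopping early once the gradient norm is small.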