Equilibrium Matching: Generative Modeling with Implicit Energy-Based Models
October 2, 2025
Authors: Runqian Wang, Yilun Du
cs.AI
Abstract
We introduce Equilibrium Matching (EqM), a generative modeling framework
built from an equilibrium dynamics perspective. EqM discards the
non-equilibrium, time-conditional dynamics in traditional diffusion and
flow-based generative models and instead learns the equilibrium gradient of an
implicit energy landscape. Through this approach, we can adopt an
optimization-based sampling process at inference time, where samples are
obtained by gradient descent on the learned landscape with adjustable step
sizes, adaptive optimizers, and adaptive compute. EqM surpasses the generation
performance of diffusion/flow models empirically, achieving an FID of 1.90 on
ImageNet 256×256. EqM is also theoretically justified to learn and
sample from the data manifold. Beyond generation, EqM is a flexible framework
that naturally handles tasks including partially noised image denoising, OOD
detection, and image composition. By replacing time-conditional velocities with
a unified equilibrium landscape, EqM offers a tighter bridge between flow and
energy-based models and a simple route to optimization-driven inference.
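
The optimization-based sampling described in the abstract can be illustrated with a minimal sketch. It assumes a trained network, here called eqm_model, that maps a sample to the learned equilibrium gradient of the implicit energy landscape; the function name, step size, and step count are illustrative assumptions, not the authors' actual API. Under those assumptions, sampling reduces to gradient descent starting from Gaussian noise:

    import torch

    @torch.no_grad()
    def sample_by_gradient_descent(eqm_model, shape, num_steps=200,
                                   step_size=0.02, device="cpu"):
        # Start from Gaussian noise and descend the learned landscape.
        x = torch.randn(shape, device=device)
        for _ in range(num_steps):
            grad = eqm_model(x)        # predicted equilibrium gradient at x
            x = x - step_size * grad   # fixed-step gradient-descent update
        return x

The same loop admits variable step sizes, adaptive or momentum-based optimizers, and early stopping of the descent, which is what the abstract refers to as adjustable step sizes, adaptive optimizers, and adaptive compute.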