Unified Continuous Generative Models
May 12, 2025
Authors: Peng Sun, Yi Jiang, Tao Lin
cs.AI
Abstract
Recent advances in continuous generative models, including multi-step
approaches like diffusion and flow-matching (typically requiring 8-1000
sampling steps) and few-step methods such as consistency models (typically 1-8
steps), have demonstrated impressive generative performance. However, existing
work often treats these approaches as distinct paradigms, resulting in separate
training and sampling methodologies. We introduce a unified framework for
training, sampling, and analyzing these models. Our implementation, the Unified
Continuous Generative Models Trainer and Sampler (UCGM-{T,S}), achieves
state-of-the-art (SOTA) performance. For example, on ImageNet 256x256 using a
675M diffusion transformer, UCGM-T trains a multi-step model achieving 1.30 FID
in 20 steps and a few-step model reaching 1.42 FID in just 2 steps.
Additionally, applying UCGM-S to a pre-trained model (previously 1.26 FID at
250 steps) improves performance to 1.06 FID in only 40 steps. Code is available
at: https://github.com/LINs-lab/UCGM.
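To make the multi-step vs. few-step distinction concrete, here is a minimal, self-contained sketch of a generic Euler ODE sampler of the kind used by flow-matching models. This is an illustration only, not the UCGM-{T,S} method: `euler_sampler`, `velocity_fn`, and the toy velocity field are all hypothetical names introduced for this example, and a real model would replace `toy_v` with a trained neural network.

```python
import numpy as np

def euler_sampler(velocity_fn, x_init, num_steps):
    """Integrate dx/dt = v(x, t) from t=0 (noise) to t=1 (data)
    with fixed-step Euler. `velocity_fn` stands in for a trained
    network; `num_steps` controls the sampling budget."""
    x = x_init
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        x = x + dt * velocity_fn(x, t)
    return x

# Toy linear velocity field that contracts samples toward the origin.
toy_v = lambda x, t: -x

rng = np.random.default_rng(0)
noise = rng.standard_normal(4)

many_step = euler_sampler(toy_v, noise, 20)  # multi-step regime (e.g. diffusion)
few_step = euler_sampler(toy_v, noise, 2)    # few-step regime (e.g. consistency)
```

With this toy field each Euler step multiplies the state by (1 - dt), so the 20-step run yields `noise * 0.95**20` and the 2-step run yields `noise * 0.25`; the point is only that the same sampler interface covers both budgets, which is the axis along which the paper's unified framework operates.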