

Unified Continuous Generative Models

May 12, 2025
Authors: Peng Sun, Yi Jiang, Tao Lin
cs.AI

Abstract

Recent advances in continuous generative models, including multi-step approaches like diffusion and flow-matching (typically requiring 8-1000 sampling steps) and few-step methods such as consistency models (typically 1-8 steps), have demonstrated impressive generative performance. However, existing work often treats these approaches as distinct paradigms, resulting in separate training and sampling methodologies. We introduce a unified framework for training, sampling, and analyzing these models. Our implementation, the Unified Continuous Generative Models Trainer and Sampler (UCGM-{T,S}), achieves state-of-the-art (SOTA) performance. For example, on ImageNet 256x256 using a 675M diffusion transformer, UCGM-T trains a multi-step model achieving 1.30 FID in 20 steps and a few-step model reaching 1.42 FID in just 2 steps. Additionally, applying UCGM-S to a pre-trained model (previously 1.26 FID at 250 steps) improves performance to 1.06 FID in only 40 steps. Code is available at: https://github.com/LINs-lab/UCGM.
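The abstract contrasts multi-step samplers (diffusion, flow matching) with few-step samplers (consistency models) and argues they can share one framework. As a conceptual illustration only, the sketch below shows how a single Euler-style integration loop can cover both regimes by treating the step count as a parameter. This is a hypothetical toy, not the UCGM-{T,S} algorithm: `velocity_model` is a stand-in linear field, not a trained network.

```python
# Hypothetical sketch (NOT the UCGM-S algorithm): one Euler-style sampling
# loop where the step count is just a parameter, so a 20-step "diffusion-like"
# run and a 2-step "consistency-like" run share the same code path.
import numpy as np

def velocity_model(x, t):
    # Toy stand-in for a trained velocity network. With this linear field,
    # integrating from t=1 down to t=0 shrinks the sample norm each step.
    return x

def sample(shape, num_steps, seed=0):
    """Integrate dx/dt = v(x, t) from t=1 (noise) to t=0 (sample)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(shape)               # start from Gaussian noise
    ts = np.linspace(1.0, 0.0, num_steps + 1)    # shared time grid
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        x = x + (t_next - t_cur) * velocity_model(x, t_cur)  # Euler step
    return x

multi_step = sample((4, 8), num_steps=20)  # multi-step regime
few_step = sample((4, 8), num_steps=2)     # few-step regime
```

In a real unified sampler the update rule and the trained network would differ, but the design point stands: the multi-step/few-step distinction becomes a hyperparameter of one sampler rather than two separate codebases.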

