Adversarial Flow Models
November 27, 2025
Authors: Shanchuan Lin, Ceyuan Yang, Zhijie Lin, Hao Chen, Haoqi Fan
cs.AI
Abstract
We present adversarial flow models, a class of generative models that unifies adversarial models and flow models. Our method supports native one-step or multi-step generation and is trained with an adversarial objective. Unlike traditional GANs, where the generator learns an arbitrary transport plan between the noise and data distributions, our generator learns a deterministic noise-to-data mapping, the same optimal transport coupling as in flow-matching models, which significantly stabilizes adversarial training. Also, unlike consistency-based methods, our model learns one-step or few-step generation directly, without propagating through intermediate timesteps of the probability flow. This saves model capacity, reduces training iterations, and avoids error accumulation. Under the same 1NFE setting on ImageNet-256px, our B/2 model approaches the performance of consistency-based XL/2 models, while our XL/2 model sets a new best FID of 2.38. We additionally show the possibility of end-to-end training of 56-layer and 112-layer models through depth repetition without any intermediate supervision, achieving FIDs of 2.08 and 1.94 in a single forward pass and surpassing their 2NFE and 4NFE counterparts.
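The abstract's central distinction, an arbitrary transport plan versus the deterministic optimal-transport coupling of flow matching, can be made concrete with a toy 1-D experiment. This is an illustration of the underlying transport argument, not the paper's training code: in 1-D, the optimal transport plan under quadratic cost is the monotone (sorted, quantile-matching) pairing, so a deterministic monotone noise-to-data mapping attains a strictly lower transport cost than a random pairing, which is one way to see why constraining the generator to this coupling removes a large space of equally valid but unstable solutions.

```python
import numpy as np

# Toy 1-D illustration (not the paper's method): compare the transport cost of
# an arbitrary noise-to-data pairing, as an unconstrained GAN generator might
# realize, against the monotone (sorted) pairing, which is the optimal
# transport coupling in 1-D and the deterministic mapping that flow-matching
# models converge to.

rng = np.random.default_rng(0)
noise = rng.standard_normal(10_000)            # samples from the prior N(0, 1)
data = 2.0 * rng.standard_normal(10_000) + 3.0 # toy "data" samples, N(3, 4)

def transport_cost(paired_noise, paired_data):
    """Mean squared displacement of a pairing (quadratic transport cost)."""
    return np.mean((paired_noise - paired_data) ** 2)

# Arbitrary coupling: pair noise and data samples in random order.
arbitrary = transport_cost(noise, rng.permutation(data))

# Deterministic monotone coupling: sorting both sets matches quantiles,
# which is the optimal transport plan in 1-D.
optimal = transport_cost(np.sort(noise), np.sort(data))

print(f"arbitrary pairing cost: {arbitrary:.2f}")  # ~ 1 + 4 + 3^2 = 14
print(f"monotone pairing cost:  {optimal:.2f}")    # ~ (2-1)^2 + 3^2 = 10
assert optimal < arbitrary
```

The expected costs follow from the closed forms for Gaussians: an independent random pairing costs Var(noise) + Var(data) + (mean gap)², while the monotone coupling attains the squared 2-Wasserstein distance (σ₁ − σ₂)² + (μ₁ − μ₂)².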