Continuous Adversarial Flow Models

Abstract

We propose continuous adversarial flow models, a type of continuous-time flow model trained with an adversarial objective. Unlike flow matching, which uses a fixed mean-squared-error criterion, our approach introduces a learned discriminator to guide training. This change in objective induces a different generalized distribution, which empirically produces samples that are better aligned with the target data distribution. Our method is primarily proposed for post-training existing flow-matching models, although it can also train models from scratch. On the ImageNet 256px generation task, our post-training substantially improves the guidance-free FID of latent-space SiT from 8.26 to 3.63 and of pixel-space JiT from 7.17 to 3.57. It also improves guided generation, reducing FID from 2.06 to 1.53 for SiT and from 1.86 to 1.80 for JiT. We further evaluate our approach on text-to-image generation, where it achieves improved results on both the GenEval and DPG benchmarks.
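For reference, the fixed mean-squared-error criterion that the abstract contrasts against can be sketched as follows. This is a minimal illustration of standard flow matching only, assuming a linear noise-to-data path (x_t = (1 - t) * noise + t * data) and a hypothetical `velocity_fn` standing in for the model; the path, the time convention, and all names are assumptions, not details taken from the paper. The adversarial objective itself is not specified in the abstract and is not reproduced here.

```python
import numpy as np


def flow_matching_mse(velocity_fn, data, rng):
    """Monte Carlo estimate of the flow-matching MSE loss for one batch.

    Assumes a linear path from Gaussian noise (t = 0) to data (t = 1);
    other papers use the opposite time convention.
    """
    noise = rng.standard_normal(data.shape)
    # One time value per sample, shaped to broadcast over feature dims.
    t = rng.uniform(size=(data.shape[0],) + (1,) * (data.ndim - 1))
    x_t = (1.0 - t) * noise + t * data   # point on the interpolation path
    target = data - noise                # velocity of the linear path
    pred = velocity_fn(x_t, t)           # model's predicted velocity
    return float(np.mean((pred - target) ** 2))


# Toy usage with a hypothetical "model" that always predicts zero velocity.
rng = np.random.default_rng(0)
data = rng.standard_normal((8, 4))
zero_model = lambda x_t, t: np.zeros_like(x_t)
loss = flow_matching_mse(zero_model, data, rng)
```

Under the approach described above, the squared-error term would be replaced by a training signal from a learned discriminator; how that signal is defined is left to the full paper.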