Continuous Adversarial Flow Models

Abstract

We propose continuous adversarial flow models, a type of continuous-time flow model trained with an adversarial objective. Unlike flow matching, which uses a fixed mean-squared-error criterion, our approach introduces a learned discriminator to guide training. This change in objective induces a different generalized distribution, which empirically produces samples that are better aligned with the target data distribution. Our method is proposed primarily for post-training existing flow-matching models, although it can also train models from scratch. On the ImageNet 256px generation task, our post-training substantially improves the guidance-free FID of latent-space SiT from 8.26 to 3.63 and of pixel-space JiT from 7.17 to 3.57. It also improves guided generation, reducing FID from 2.06 to 1.53 for SiT and from 1.86 to 1.80 for JiT. We further evaluate our approach on text-to-image generation, where it achieves improved results on both the GenEval and DPG benchmarks.
