Self-Adversarial One Step Generation via Condition Shifting
April 14, 2026
Authors: Deyuan Liu, Peng Sun, Yansen Han, Zhenglin Cheng, Chuyan Chen, Tao Lin
cs.AI
Abstract
The push for efficient text-to-image synthesis has moved the field toward one-step sampling, yet existing methods still face a three-way tradeoff among fidelity, inference speed, and training efficiency. Approaches that rely on external discriminators can sharpen one-step performance, but they often introduce training instability, high GPU memory overhead, and slow convergence, which complicates scaling and parameter-efficient tuning. In contrast, regression-based distillation and consistency objectives are easier to optimize, but they typically lose fine details when constrained to a single step. We present APEX, built on a key theoretical insight: adversarial correction signals can be extracted endogenously from a flow model through condition shifting. Applying this transformation creates a shifted-condition branch whose velocity field serves as an independent estimator of the model's current generation distribution, yielding a gradient that is provably GAN-aligned and replacing the sample-dependent discriminator terms that cause gradient vanishing. This discriminator-free design is architecture-preserving, making APEX a plug-and-play framework compatible with both full-parameter and LoRA-based tuning. Empirically, our 0.6B model surpasses FLUX-Schnell 12B (20× more parameters) in one-step quality. With LoRA tuning on Qwen-Image 20B, APEX reaches a GenEval score of 0.89 at NFE=1 in 6 hours, surpassing the original 50-step teacher (0.87) and providing a 15.33× inference speedup. Code is available at https://github.com/LINs-lab/APEX.
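The abstract describes the core mechanism only at a high level: a condition-shift transform produces a second branch of the same flow model, and the gap between the two branches' velocity fields supplies a discriminator-free correction signal. The toy sketch below illustrates that idea with a stand-in linear "flow model"; the functions `velocity`, `shifted_condition`, and `adversarial_correction`, and the additive form of the shift, are all hypothetical assumptions, not the paper's actual formulation.

```python
import numpy as np

def velocity(x_t, t, c, W):
    # Toy linear stand-in for a flow model's velocity field v(x_t, t, c).
    # The real APEX operates on a large text-to-image flow transformer;
    # this hypothetical form only illustrates the branching idea.
    return W @ x_t + t * c

def shifted_condition(c, delta):
    # Hypothetical condition-shift transform. The abstract does not
    # specify its exact form; an additive offset is assumed here.
    return c + delta

def adversarial_correction(x_t, t, c, delta, W):
    # The shifted branch's velocity acts as an independent estimate of
    # the model's current generation distribution; its difference from
    # the original branch stands in for the correction signal that the
    # abstract says replaces an external discriminator.
    v_cond = velocity(x_t, t, c, W)
    v_shifted = velocity(x_t, t, shifted_condition(c, delta), W)
    return v_cond - v_shifted

rng = np.random.default_rng(0)
d = 4
W = rng.standard_normal((d, d))
x_t = rng.standard_normal(d)
c = rng.standard_normal(d)

# With a zero shift the two branches coincide and the signal vanishes;
# a nonzero shift produces a nonzero correction.
print(np.allclose(adversarial_correction(x_t, 0.5, c, np.zeros(d), W), 0.0))
```

In this linearized toy, the correction reduces to `-t * delta`, i.e. it depends only on the shift, not the sample; the actual APEX signal is derived from the model's learned velocity field and is what the paper proves to be GAN-aligned.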