Accelerate TarFlow Sampling with GS-Jacobi Iteration
May 19, 2025
Authors: Ben Liu, Zhen Qin
cs.AI
Abstract
Image generation models are now widely used. For example, the TarFlow model
combines the Transformer architecture with
Normalizing Flow models, achieving state-of-the-art results on multiple
benchmarks. However, because causal attention requires sequential
computation, TarFlow's sampling process is extremely slow. In this paper, we
demonstrate that through a series of optimization strategies, TarFlow sampling
can be greatly accelerated by using the Gauss-Seidel-Jacobi (abbreviated as
GS-Jacobi) iteration method. Specifically, we find that blocks in the TarFlow
model have varying importance: a small number of blocks play a major role in
image generation tasks, while other blocks contribute relatively little; some
blocks are sensitive to initial values and prone to numerical overflow, while
others are relatively robust. Based on these two characteristics, we propose
the Convergence Ranking Metric (CRM) and the Initial Guessing Metric (IGM): CRM
is used to identify whether a TarFlow block is "simple" (converges in few
iterations) or "tough" (requires more iterations); IGM is used to evaluate
whether the initial value of the iteration is good. Experiments on four TarFlow
models demonstrate that GS-Jacobi sampling can significantly enhance sampling
efficiency while maintaining the quality of generated images (measured by FID),
achieving speed-ups of 4.53x on Img128cond, 5.32x on AFHQ, 2.96x on
Img64uncond, and 2.51x on Img64cond without degrading FID scores or sample
quality. Code and checkpoints are available at
https://github.com/encoreus/GS-Jacobi_for_TarFlow
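
The abstract names the GS-Jacobi iteration without spelling it out, so the following is only a rough illustration of the general idea and not the authors' implementation. It assumes a toy affine autoregressive map in which position t is shifted and scaled by a function of the earlier positions only. The weights `W_MU` and `W_S`, the `tanh` stand-in for the causal Transformer, the fixed segment length, and the stopping tolerance are all hypothetical choices made for this sketch. CRM and IGM are not modeled here: every segment gets the same iteration budget and the initial guess is simply the latent itself.

```python
import math
import random

random.seed(0)
T = 16  # toy sequence length (hypothetical)

# Strictly lower-triangular weights: row t only uses positions j < t (causal).
W_MU = [[random.gauss(0, 0.3) if j < t else 0.0 for j in range(T)] for t in range(T)]
W_S = [[random.gauss(0, 0.1) if j < t else 0.0 for j in range(T)] for t in range(T)]

def shift_scale(x, t):
    """Toy stand-in for the causal network: shift and log-scale for position t."""
    mu = sum(W_MU[t][j] * math.tanh(x[j]) for j in range(t))
    s = sum(W_S[t][j] * math.tanh(x[j]) for j in range(t))
    return mu, s

def forward(x):
    """Data -> latent. All positions could be computed in parallel."""
    out = []
    for t in range(T):
        mu, s = shift_scale(x, t)
        out.append((x[t] - mu) * math.exp(-s))
    return out

def invert_sequential(z):
    """Baseline sampling: position t must wait for positions < t."""
    x = [0.0] * T
    for t in range(T):
        mu, s = shift_scale(x, t)
        x[t] = z[t] * math.exp(s) + mu
    return x

def invert_gs_jacobi(z, segment_len=4, max_iters=50, tol=1e-10):
    """Segments are solved in order (Gauss-Seidel); inside a segment all
    positions are updated together from the previous iterate (Jacobi)."""
    x = list(z)  # initial guess (an assumption of this sketch)
    passes = 0
    for start in range(0, T, segment_len):
        idx = range(start, min(start + segment_len, T))
        for _ in range(max_iters):
            new = []
            for t in idx:  # each update reads only the previous iterate
                mu, s = shift_scale(x, t)
                new.append(z[t] * math.exp(s) + mu)
            delta = max(abs(n - x[t]) for n, t in zip(new, idx))
            for n, t in zip(new, idx):
                x[t] = n
            passes += 1
            if delta < tol:  # segment has stopped changing
                break
    return x, passes

x_true = [random.gauss(0, 1) for _ in range(T)]
z = forward(x_true)
x_seq = invert_sequential(z)
x_gs, passes = invert_gs_jacobi(z)
print(max(abs(a - b) for a, b in zip(x_gs, x_seq)), passes)
```

In this toy, a segment of length 4 reaches its fixed point after at most 4 parallel passes plus one pass to detect that nothing changed, so the sketch shows the mechanics rather than a speed-up. Any practical gain depends on segments converging in far fewer passes than their length, which is the question the paper's CRM and IGM metrics are meant to address.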