Accelerate TarFlow Sampling with GS-Jacobi Iteration
May 19, 2025
Authors: Ben Liu, Zhen Qin
cs.AI
Abstract
Image generation models are now widely used. One example, the TarFlow model,
combines the Transformer architecture with normalizing flows and achieves
state-of-the-art results on multiple benchmarks. However, because its causal
attention requires sequential computation, TarFlow's sampling process is
extremely slow. In this paper, we
demonstrate that through a series of optimization strategies, TarFlow sampling
can be greatly accelerated by using the Gauss-Seidel-Jacobi (abbreviated as
GS-Jacobi) iteration method. Specifically, we find that blocks in the TarFlow
model have varying importance: a small number of blocks play a major role in
image generation tasks, while other blocks contribute relatively little; some
blocks are sensitive to initial values and prone to numerical overflow, while
others are relatively robust. Based on these two characteristics, we propose
the Convergence Ranking Metric (CRM) and the Initial Guessing Metric (IGM): CRM
is used to identify whether a TarFlow block is "simple" (converges in few
iterations) or "tough" (requires more iterations); IGM is used to evaluate
whether the initial value of the iteration is good. Experiments on four TarFlow
models show that GS-Jacobi sampling significantly improves sampling efficiency
without degrading FID scores or sample quality, achieving speed-ups of 4.53x on
Img128cond, 5.32x on AFHQ, 2.96x on Img64uncond, and 2.51x on Img64cond. Code
and checkpoints are available at
https://github.com/encoreus/GS-Jacobi_for_TarFlow
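
To make the idea in the abstract concrete, the following is a minimal sketch of how a causal (autoregressive) affine flow can be inverted by fixed-point iteration instead of strictly token by token. It is not the authors' implementation: the toy `conditioner`, the choice of `x = z` as the initial guess, the fixed `segment_len`, and all function names are assumptions made for illustration. The sketch assumes a Gauss-Seidel sweep over segments of the sequence with Jacobi updates inside each segment, which is one common reading of a "GS-Jacobi" scheme.

```python
import math

def conditioner(prefix):
    """Toy stand-in for a causal Transformer (hypothetical).
    Returns (mu, alpha) computed only from the earlier positions x_{<t}."""
    if not prefix:
        return 0.0, 0.0
    mean = sum(prefix) / len(prefix)
    return 0.5 * mean, 0.1 * math.tanh(sum(prefix))

def update(z, x, t):
    """Affine inverse for position t: x_t = z_t * exp(alpha_t) + mu_t,
    where alpha_t and mu_t depend only on x[:t]."""
    mu, alpha = conditioner(x[:t])
    return z[t] * math.exp(alpha) + mu

def sample_sequential(z):
    """Baseline: exact inverse, one position at a time (T dependent steps)."""
    x = []
    for t in range(len(z)):
        x.append(update(z, x, t))
    return x

def sample_gs_jacobi(z, segment_len, tol=1e-10, max_iters=None):
    """Gauss-Seidel over segments, Jacobi sweeps inside each segment.
    Returns the sample and the total number of Jacobi sweeps performed."""
    T = len(z)
    x = list(z)  # initial guess x = z (an assumption of this sketch)
    total_sweeps = 0
    for start in range(0, T, segment_len):
        end = min(start + segment_len, T)
        # A segment of length m is exact after at most m sweeps, because the
        # dependency structure is triangular and earlier segments are done.
        limit = max_iters or (end - start)
        for _ in range(limit):
            # Jacobi sweep: every position reads the same previous iterate.
            new_seg = [update(z, x, t) for t in range(start, end)]
            delta = max(abs(a - b) for a, b in zip(new_seg, x[start:end]))
            x[start:end] = new_seg
            total_sweeps += 1
            if delta < tol:  # early exit once the segment stops changing
                break
    return x, total_sweeps

z = [0.3, -1.2, 0.8, 0.1, -0.5, 1.5, -0.7, 0.2]
x_seq = sample_sequential(z)
x_gs, sweeps = sample_gs_jacobi(z, segment_len=4)
print(max(abs(a - b) for a, b in zip(x_seq, x_gs)), sweeps)
```

In this toy the sweeps run serially, so nothing is actually faster. In a real model each Jacobi sweep would be a single parallel forward pass over the segment, so a saving only appears when a segment converges in fewer sweeps than it has positions. Setting `segment_len=1` recovers the sequential baseline, and `segment_len=len(z)` gives pure Jacobi iteration.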