KLASS: KL-Guided Fast Inference in Masked Diffusion Models
November 7, 2025
Authors: Seo Hyun Kim, Sunwoo Hong, Hojung Jung, Youngrok Park, Se-Young Yun
cs.AI
Abstract
Masked diffusion models have demonstrated competitive results on various
tasks, including language generation. However, due to their iterative
refinement process, inference is often bottlenecked by a slow, static sampling
schedule. To overcome this problem, we introduce "KL-Adaptive Stability
Sampling" (KLASS), a fast yet effective sampling method that exploits
token-level KL divergence to identify stable, high-confidence predictions. By
unmasking multiple tokens in each iteration without any additional model
training, our approach speeds up generation significantly while maintaining
sample quality. On reasoning benchmarks, KLASS achieves up to 2.78×
wall-clock speedups while improving performance over standard greedy decoding,
attaining state-of-the-art results among diffusion-based samplers. We further
validate KLASS across diverse domains, including text, image, and molecular
generation, showing its effectiveness as a broadly applicable sampler across
different models.
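To illustrate the idea the abstract describes, here is a minimal sketch of token-level KL-based stability selection. All names, thresholds, and the exact stability criterion are illustrative assumptions, not the paper's published algorithm: it assumes we have per-token predictive distributions from two consecutive refinement iterations and unmask the still-masked positions whose distribution has barely changed (low KL) and is high-confidence.

```python
import numpy as np

def token_kl(p, q, eps=1e-12):
    # Per-token KL(p || q); p, q have shape [seq_len, vocab_size],
    # each row a probability distribution over the vocabulary.
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)

def select_stable_tokens(prev_probs, curr_probs, masked,
                         kl_thresh=1e-3, conf_thresh=0.9):
    # Hypothetical selection rule: a masked position is "stable" when its
    # predictive distribution changed little between iterations (low KL)
    # and its top prediction is confident. Thresholds are made up here.
    kl = token_kl(curr_probs, prev_probs)
    conf = curr_probs.max(axis=-1)
    stable = (kl < kl_thresh) & (conf > conf_thresh) & masked
    return np.nonzero(stable)[0]

# Toy example: 3 masked positions, vocab of 4.
prev = np.array([[0.97, 0.01, 0.01, 0.01],   # stable and confident
                 [0.40, 0.30, 0.20, 0.10],   # distribution still shifting
                 [0.40, 0.30, 0.20, 0.10]])  # stable but low-confidence
curr = np.array([[0.97, 0.01, 0.01, 0.01],
                 [0.10, 0.20, 0.30, 0.40],
                 [0.40, 0.30, 0.20, 0.10]])
masked = np.array([True, True, True])
print(select_stable_tokens(prev, curr, masked))  # only position 0 qualifies
```

In an actual sampler, the selected positions would be committed (unmasked) in one step, so several tokens can be finalized per iteration instead of one, which is where the wall-clock speedup comes from.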