
KLASS: KL-Guided Fast Inference in Masked Diffusion Models

November 7, 2025
作者: Seo Hyun Kim, Sunwoo Hong, Hojung Jung, Youngrok Park, Se-Young Yun
cs.AI

Abstract

Masked diffusion models have demonstrated competitive results on various tasks including language generation. However, due to their iterative refinement process, inference is often bottlenecked by slow and static sampling speed. To overcome this problem, we introduce "KL-Adaptive Stability Sampling" (KLASS), a fast yet effective sampling method that exploits token-level KL divergence to identify stable, high-confidence predictions. By unmasking multiple tokens in each iteration without any additional model training, our approach speeds up generation significantly while maintaining sample quality. On reasoning benchmarks, KLASS achieves up to 2.78× wall-clock speedups while improving performance over standard greedy decoding, attaining state-of-the-art results among diffusion-based samplers. We further validate KLASS across diverse domains, including text, image, and molecular generation, showing its effectiveness as a broadly applicable sampler across different models.
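The core idea of the abstract, selecting masked positions whose predicted distribution has stabilized across consecutive denoising steps (small token-level KL divergence) and is high-confidence, then unmasking them in parallel, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, threshold values, and the specific stability criterion (KL below a cutoff combined with a confidence floor) are assumptions for exposition.

```python
import numpy as np

def token_kl(p, q, eps=1e-12):
    """Token-level KL divergence KL(p || q) between two categorical
    distributions of shape (seq_len, vocab_size)."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * np.log(p / q), axis=-1)  # (seq_len,)

def klass_step(prev_probs, curr_probs, masked,
               kl_threshold=1e-3, conf_threshold=0.9):
    """One illustrative KLASS-style selection step.

    prev_probs, curr_probs: model's predicted token distributions at two
    consecutive refinement iterations, shape (seq_len, vocab_size).
    masked: boolean array marking still-masked positions, shape (seq_len,).
    Returns (unmask, tokens): which masked positions are stable enough to
    unmask in parallel, and the greedy token choice at each position.
    Thresholds are hypothetical, not from the paper.
    """
    kl = token_kl(curr_probs, prev_probs)          # change between steps
    confidence = curr_probs.max(axis=-1)           # top-1 probability
    unmask = (kl < kl_threshold) & (confidence > conf_threshold) & masked
    tokens = curr_probs.argmax(axis=-1)
    return unmask, tokens
```

In a sampler loop, positions flagged by `unmask` would be committed in a single iteration instead of one token at a time, which is where the claimed wall-clock speedup over greedy decoding comes from.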
December 2, 2025