Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

Abstract

Classifier-Free Guidance (CFG) significantly enhances controllability in generative models by combining conditional and unconditional predictions. However, standard CFG often employs a static unconditional input, which can be suboptimal for iterative generation processes where model uncertainty varies dynamically. We introduce Adaptive Classifier-Free Guidance (A-CFG), a novel method that tailors the unconditional input by leveraging the model's instantaneous predictive confidence. At each step of an iterative (masked) diffusion language model, A-CFG identifies tokens in the currently generated sequence for which the model exhibits low confidence. These tokens are temporarily re-masked to create a dynamic, localized unconditional input. This focuses CFG's corrective influence precisely on areas of ambiguity, leading to more effective guidance. We integrate A-CFG into a state-of-the-art masked diffusion language model and demonstrate its efficacy. Experiments on diverse language generation benchmarks show that A-CFG yields substantial improvements over standard CFG, achieving, for instance, a 3.9-point gain on GPQA. Our work highlights the benefit of dynamically adapting guidance mechanisms to model uncertainty in iterative generation.
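The re-masking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: confidence is taken as the maximum softmax probability at each position, a fixed fraction of the generated tokens is re-masked, and the guided logits follow one common CFG convention, `cond + w * (cond - uncond)`. The names `acfg_step`, `toy_model`, `remask_fraction`, and `guidance_scale` are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple

Logits = List[List[float]]  # one row of vocabulary logits per position


def softmax(row: Sequence[float]) -> List[float]:
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def acfg_step(
    model_fn: Callable[[List[int]], Logits],
    tokens: List[int],
    mask_id: int,
    prompt_len: int,
    guidance_scale: float = 1.0,
    remask_fraction: float = 0.5,
) -> Tuple[Logits, List[int]]:
    """One guidance step: re-mask low-confidence tokens, then apply CFG."""
    cond = model_fn(tokens)

    # Assumption: confidence is the max softmax probability per position.
    confidence = [max(softmax(row)) for row in cond]

    # Only already-generated (non-prompt, unmasked) tokens can be re-masked.
    candidates = [
        i for i in range(prompt_len, len(tokens)) if tokens[i] != mask_id
    ]
    k = int(len(candidates) * remask_fraction)
    low_conf = sorted(sorted(candidates, key=lambda i: confidence[i])[:k])

    # Dynamic, localized "unconditional" input: same sequence, with the
    # least confident tokens temporarily replaced by the mask token.
    uncond_tokens = list(tokens)
    for i in low_conf:
        uncond_tokens[i] = mask_id
    uncond = model_fn(uncond_tokens)

    # Assumption: guided = cond + w * (cond - uncond).
    guided = [
        [c + guidance_scale * (c - u) for c, u in zip(c_row, u_row)]
        for c_row, u_row in zip(cond, uncond)
    ]
    return guided, low_conf


def toy_model(tokens: List[int]) -> Logits:
    """Hypothetical stand-in for a masked diffusion LM (vocab size 3, mask id 2)."""
    return [
        [0.0, 0.0, 0.0] if t == 2 else [float(i), 0.0, 0.0]
        for i, t in enumerate(tokens)
    ]


guided, remasked = acfg_step(
    toy_model, [0, 1, 0, 1], mask_id=2, prompt_len=1, remask_fraction=0.34
)
```

In this toy run, guidance changes the logits only at the re-masked position, where the conditional and unconditional predictions differ. Elsewhere the two forward passes agree and the guided logits equal the conditional ones, which is the localized effect the abstract describes.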
