Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking
May 26, 2025
Authors: Pengxiang Li, Shilin Yan, Joey Tsai, Renrui Zhang, Ruichuan An, Ziyu Guo, Xiaowei Gao
cs.AI
Abstract
Classifier-Free Guidance (CFG) significantly enhances controllability in
generative models by interpolating conditional and unconditional predictions.
However, standard CFG often employs a static unconditional input, which can be
suboptimal for iterative generation processes where model uncertainty varies
dynamically. We introduce Adaptive Classifier-Free Guidance (A-CFG), a novel
method that tailors the unconditional input by leveraging the model's
instantaneous predictive confidence. At each step of an iterative (masked)
diffusion language model, A-CFG identifies tokens in the currently generated
sequence for which the model exhibits low confidence. These tokens are
temporarily re-masked to create a dynamic, localized unconditional input. This
focuses CFG's corrective influence precisely on areas of ambiguity, leading to
more effective guidance. We integrate A-CFG into a state-of-the-art masked
diffusion language model and demonstrate its efficacy. Experiments on diverse
language generation benchmarks show that A-CFG yields substantial improvements
over standard CFG, achieving, for instance, a 3.9 point gain on GPQA. Our work
highlights the benefit of dynamically adapting guidance mechanisms to model
uncertainty in iterative generation.
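The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `MASK`, `low_confidence_positions`, `build_uncond_input`, and `cfg_combine`, the `ratio` parameter, the use of the maximum softmax probability as the confidence score, and the guidance form `uncond + w * (cond - uncond)` are all hypothetical choices made for this example. The abstract does not specify any of them.

```python
import math

MASK = -1  # hypothetical id standing in for the mask token


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def low_confidence_positions(logits_per_pos, tokens, prompt_len, ratio):
    """Pick the least confident generated (unmasked, non-prompt) positions.

    Confidence is assumed here to be the max softmax probability at each
    position; `ratio` is an assumed fraction of candidates to re-mask.
    """
    scored = []
    for i in range(prompt_len, len(tokens)):
        if tokens[i] == MASK:
            continue  # still masked, nothing to re-mask
        scored.append((max(softmax(logits_per_pos[i])), i))
    scored.sort()  # lowest confidence first
    k = math.ceil(ratio * len(scored))
    return sorted(i for _, i in scored[:k])


def build_uncond_input(tokens, positions):
    """Copy the sequence and temporarily re-mask the chosen positions."""
    out = list(tokens)
    for i in positions:
        out[i] = MASK
    return out


def cfg_combine(cond, uncond, w):
    """Assumed CFG form: start at uncond and move w times toward cond."""
    return [u + w * (c - u) for c, u in zip(cond, uncond)]


# Toy example: one prompt token, three generated tokens, one masked slot.
tokens = [5, 6, 7, 8, MASK]
logits = [
    [0.0, 0.0, 0.0],  # prompt position (ignored)
    [4.0, 0.0, 0.0],  # confident
    [0.1, 0.0, 0.0],  # least confident
    [2.0, 0.0, 0.0],  # moderately confident
    [0.0, 0.0, 0.0],  # masked position (ignored)
]
positions = low_confidence_positions(logits, tokens, prompt_len=1, ratio=0.5)
uncond_input = build_uncond_input(tokens, positions)
guided = cfg_combine([2.0, 0.0], [1.0, 1.0], w=2.0)
```

In a real sampler, the model would be run once on the current sequence and once on `uncond_input`, and the two sets of logits would be passed to `cfg_combine` at every denoising step. Guidance-scale conventions differ between codebases (some use `1 + w` as the multiplier), so the scale here should be read as illustrative only.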