Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models
August 12, 2025
Authors: Wen Wang, Bozhen Fang, Chenchen Jing, Yongliang Shen, Yangyi Shen, Qiuyu Wang, Hao Ouyang, Hao Chen, Chunhua Shen
cs.AI
Abstract
Diffusion large language models (dLLMs) generate text through iterative
denoising, yet current decoding strategies discard rich intermediate
predictions in favor of the final output. Our work reveals a critical
phenomenon, temporal oscillation, in which correct answers often emerge at
intermediate denoising steps but are overwritten later. To address this
issue, we introduce two complementary methods that exploit temporal
consistency: 1) Temporal Self-Consistency Voting, a training-free, test-time
decoding strategy that aggregates predictions across denoising steps to select
the most consistent output; and 2) a post-training method termed Temporal
Consistency Reinforcement, which uses Temporal Semantic Entropy (TSE), a
measure of semantic stability across intermediate predictions, as a reward
signal to encourage stable generations. Empirical results across multiple
benchmarks demonstrate the effectiveness of our approach. Using the negative
TSE reward alone, we observe a remarkable average improvement of 24.7% on the
Countdown dataset over an existing dLLM. Combined with the accuracy reward, we
achieve absolute gains of 2.0% on GSM8K, 4.3% on MATH500, 6.6% on SVAMP, and
25.3% on Countdown. Our findings underscore the untapped
potential of temporal dynamics in dLLMs and offer two simple yet effective
tools to harness them.
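
To make the first method concrete, here is a minimal Python sketch of Temporal Self-Consistency Voting as the abstract describes it: one decoded string is kept per denoising step, an answer is extracted from each, and the most frequent answer is selected. The function name, the answer-extraction hook, and the uniform (unweighted) vote are illustrative assumptions, not the paper's exact scheme.

    from collections import Counter
    from typing import Callable, List

    def temporal_self_consistency_vote(
        intermediate_texts: List[str],
        extract_answer: Callable[[str], str],
    ) -> str:
        """Pick the answer that recurs most often across denoising steps.

        `intermediate_texts` holds one decoded string per denoising step,
        ordered from the noisiest step to the final one. A plain majority
        vote is used here; the paper may weight steps differently.
        """
        answers = [extract_answer(t) for t in intermediate_texts]
        answers = [a for a in answers if a]  # skip steps with no parsable answer
        if not answers:
            return ""
        # The most frequent answer is taken as the most temporally consistent one.
        return Counter(answers).most_common(1)[0][0]

For example, if the extracted answers across steps are ["", "12", "12", "15"], the vote returns "12" even though the final step produced "15", which is precisely the temporal-oscillation failure mode the method targets.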
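
The second method's reward signal can be sketched in the same spirit. Below, intermediate answers are grouped into clusters and the Shannon entropy of the cluster distribution is taken as the Temporal Semantic Entropy; the negative entropy is the reward, so stable generations score higher. Canonicalized exact match stands in for true semantic equivalence here, which is an assumption of this sketch rather than the paper's definition.

    import math
    from collections import Counter
    from typing import Callable, List

    def temporal_semantic_entropy(
        intermediate_answers: List[str],
        canonicalize: Callable[[str], str] = lambda s: s.strip().lower(),
    ) -> float:
        """Shannon entropy over clusters of intermediate answers.

        Clustering is approximated by canonicalized exact match; a real
        implementation would test semantic equivalence (e.g. mathematical
        equality of the answers).
        """
        clusters = Counter(canonicalize(a) for a in intermediate_answers)
        total = sum(clusters.values())
        return -sum((n / total) * math.log(n / total) for n in clusters.values())

    def tse_reward(intermediate_answers: List[str]) -> float:
        # Negative TSE: low-entropy (stable) trajectories earn higher reward.
        return -temporal_semantic_entropy(intermediate_answers)

A trajectory whose intermediate answers all agree yields a TSE of 0 and thus the maximal reward of 0; the more the answers oscillate across steps, the more negative the reward becomes.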