Time Is a Feature: Exploiting Temporal Dynamics in Diffusion Language Models
August 12, 2025
Authors: Wen Wang, Bozhen Fang, Chenchen Jing, Yongliang Shen, Yangyi Shen, Qiuyu Wang, Hao Ouyang, Hao Chen, Chunhua Shen
cs.AI
Abstract
Diffusion large language models (dLLMs) generate text through iterative
denoising, yet current decoding strategies discard rich intermediate
predictions in favor of the final output. Our work reveals a critical
phenomenon, temporal oscillation, in which correct answers often emerge at
intermediate denoising steps but are overwritten later in the process. To address this
issue, we introduce two complementary methods that exploit temporal
consistency: 1) Temporal Self-Consistency Voting, a training-free, test-time
decoding strategy that aggregates predictions across denoising steps to select
the most consistent output; and 2) a post-training method termed Temporal
Consistency Reinforcement, which uses Temporal Semantic Entropy (TSE), a
measure of semantic stability across intermediate predictions, as a reward
signal to encourage stable generations. Empirical results across multiple
benchmarks demonstrate the effectiveness of our approach. Using the negative
TSE reward alone, we observe a remarkable average improvement of 24.7% on the
Countdown dataset over an existing dLLM. Combined with the accuracy reward, we
achieve absolute gains of 2.0% on GSM8K, 4.3% on MATH500, 6.6% on SVAMP, and
25.3% on Countdown, respectively. Our findings underscore the untapped
potential of temporal dynamics in dLLMs and offer two simple yet effective
tools to harness them.
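
A minimal sketch of Temporal Self-Consistency Voting as described above, assuming we can decode an answer string at every denoising step; the function name and the plain majority-vote aggregation are illustrative assumptions rather than the paper's exact procedure, which may weight or filter steps differently.

```python
from collections import Counter

def temporal_self_consistency_vote(intermediate_answers):
    """Return the answer that appears most often across the answers
    decoded at each denoising step, i.e. the temporally most
    consistent prediction rather than just the final one."""
    counts = Counter(a for a in intermediate_answers if a)  # ignore empty decodes
    if not counts:
        return None
    answer, _ = counts.most_common(1)[0]
    return answer

# Example of temporal oscillation: the correct answer "12" surfaces at
# several intermediate steps even though the final step drifts to "7".
print(temporal_self_consistency_vote(["12", "12", "7", "12", "7"]))  # -> "12"
```

Because it only aggregates predictions the sampler already produces, a strategy of this kind stays training-free and adds no extra forward passes at test time.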
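For Temporal Consistency Reinforcement, the sketch below shows one way a Temporal Semantic Entropy reward could be computed, under the assumption that intermediate predictions are grouped into semantic-equivalence clusters (approximated here by exact string match for brevity) and that the negated entropy of the cluster distribution serves directly as the reward; the clustering criterion, normalization, and function names are assumptions, not the paper's implementation.

```python
import math
from collections import Counter

def temporal_semantic_entropy(intermediate_answers):
    """Entropy of the distribution of semantic clusters formed by the
    predictions across denoising steps. Low entropy means the model's
    answer is temporally stable; high entropy indicates oscillation.
    Exact string match stands in for a real semantic-equivalence check
    (e.g., math-expression normalization)."""
    counts = Counter(intermediate_answers)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def tse_reward(intermediate_answers):
    """Negative TSE: stable trajectories receive a higher reward."""
    return -temporal_semantic_entropy(intermediate_answers)

# A stable trajectory earns a higher reward than an oscillating one.
print(tse_reward(["12", "12", "12", "12", "12"]))   # 0.0
print(tse_reward(["12", "7", "12", "7", "3"]))       # ~ -1.05
```

Used as a reward signal in post-training, such a term penalizes trajectories whose intermediate answers keep flipping, encouraging the model to commit to semantically stable generations; the abstract reports that this reward helps even without an accuracy term.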