Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall
October 22, 2025
Authors: Mingyu Jo, Jaesik Yoon, Justin Deschenaux, Caglar Gulcehre, Sungjin Ahn
cs.AI
Abstract
Discrete diffusion models offer a promising alternative to autoregressive
generation through parallel decoding, but they suffer from a sampling wall:
once categorical sampling occurs, rich distributional information collapses
into one-hot vectors and cannot be propagated across steps, forcing subsequent
steps to operate with limited information. To mitigate this problem, we
introduce Loopholing, a novel and simple mechanism that preserves this
information via a deterministic latent pathway, leading to Loopholing Discrete
Diffusion Models (LDDMs). Trained efficiently with a self-conditioning
strategy, LDDMs achieve substantial gains: reducing generative perplexity by up
to 61% over prior baselines, closing (and in some cases surpassing) the gap
with autoregressive models, and producing more coherent text. Applied to
reasoning tasks, LDDMs also improve performance on arithmetic benchmarks such
as Countdown and Game of 24. These results also indicate that loopholing
mitigates idle steps and oscillations, providing a scalable path toward
high-quality non-autoregressive text generation.
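
The abstract does not spell out the architecture, but the contrast between the sampling wall and the loopholing bypass can be sketched in a few lines. Below is a minimal PyTorch sketch assuming a denoiser that returns per-token logits; the `cond` keyword and the use of the pre-sampling hidden state as the deterministic latent are illustrative assumptions, not the paper's actual interface.

```python
import torch
import torch.nn.functional as F

def baseline_step(model, tokens):
    """One denoising step of a standard discrete diffusion model.

    Categorical sampling collapses the predicted distribution into
    one-hot token IDs -- the 'sampling wall' described in the abstract:
    only the sampled IDs reach the next step.
    """
    logits = model(tokens)                                # (batch, length, vocab)
    probs = F.softmax(logits, dim=-1)
    next_tokens = torch.distributions.Categorical(probs=probs).sample()
    return next_tokens                                    # distributional info is lost here

def loopholing_step(model, tokens, latent):
    """Hypothetical loopholing step.

    Alongside the sampled tokens, a deterministic latent (here, the
    pre-sampling hidden state) is passed forward so that rich
    distributional information bypasses the sampling wall. The `cond`
    keyword and the returned hidden state are assumed interfaces,
    not the paper's actual architecture.
    """
    logits, hidden = model(tokens, cond=latent)           # (B, L, V), (B, L, D)
    probs = F.softmax(logits, dim=-1)
    next_tokens = torch.distributions.Categorical(probs=probs).sample()
    return next_tokens, hidden                            # hidden carries the "loophole"
```

In this reading, the self-conditioning training strategy mentioned in the abstract would supply the model's own previous latent as `latent` during training, so the conditioning pathway is learned without unrolling the full sampling chain; the precise training procedure is detailed in the paper itself.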