

Loopholing Discrete Diffusion: Deterministic Bypass of the Sampling Wall

October 22, 2025
Authors: Mingyu Jo, Jaesik Yoon, Justin Deschenaux, Caglar Gulcehre, Sungjin Ahn
cs.AI

Abstract

Discrete diffusion models offer a promising alternative to autoregressive generation through parallel decoding, but they suffer from a sampling wall: once categorical sampling occurs, rich distributional information collapses into one-hot vectors and cannot be propagated across steps, forcing subsequent steps to operate with limited information. To mitigate this problem, we introduce Loopholing, a novel and simple mechanism that preserves this information via a deterministic latent pathway, leading to Loopholing Discrete Diffusion Models (LDDMs). Trained efficiently with a self-conditioning strategy, LDDMs achieve substantial gains: they reduce generative perplexity by up to 61% over prior baselines, close (and in some cases surpass) the gap with autoregressive models, and produce more coherent text. Applied to reasoning tasks, LDDMs also improve performance on arithmetic benchmarks such as Countdown and Game of 24. These results further indicate that loopholing mitigates idle steps and oscillations, providing a scalable path toward high-quality non-autoregressive text generation.
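
The abstract describes the mechanism only at a high level; the paper itself holds the implementation details. As a purely illustrative aid, the sketch below shows one way the loopholing idea could look in PyTorch: each denoising step emits, alongside the sampled tokens, a deterministic latent that is fed back into the next step, and training uses a two-pass self-conditioning scheme. All names (DenoiserWithLoophole, sample, training_step) and architectural choices here are hypothetical assumptions, not taken from the paper.

```python
# Illustrative sketch only; hypothetical names and architecture, not the
# authors' implementation. Shows the loopholing idea: a deterministic
# latent pathway carries distributional information across denoising
# steps that would otherwise collapse into one-hot samples.
import torch
import torch.nn as nn


class DenoiserWithLoophole(nn.Module):
    def __init__(self, vocab_size: int, hidden_dim: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_dim)
        # Fuse the current (partially masked) tokens with the latent
        # carried over from the previous denoising step.
        self.fuse = nn.Linear(2 * hidden_dim, hidden_dim)
        self.backbone = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(hidden_dim, nhead=4, batch_first=True),
            num_layers=2,
        )
        self.to_logits = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens, prev_latent):
        h = torch.cat([self.embed(tokens), prev_latent], dim=-1)
        latent = self.backbone(self.fuse(h))  # deterministic pathway
        return self.to_logits(latent), latent


@torch.no_grad()
def sample(model, steps, seq_len, vocab_size, batch=1):
    # Start from an all-[MASK] sequence (assume the last id is [MASK]).
    tokens = torch.full((batch, seq_len), vocab_size - 1)
    latent = torch.zeros(batch, seq_len, model.embed.embedding_dim)
    for _ in range(steps):
        logits, latent = model(tokens, latent)  # latent loops forward
        # A real masked-diffusion sampler would resample only the positions
        # scheduled for this step; resampling everything keeps this short.
        tokens = torch.distributions.Categorical(logits=logits).sample()
    return tokens


def training_step(model, noisy_tokens, targets, criterion):
    # Two-pass self-conditioning (sketch): pass 1 builds a latent from a
    # zero input under no_grad; pass 2 consumes it, so gradients flow
    # through only one backbone pass.
    zero = torch.zeros(*noisy_tokens.shape, model.embed.embedding_dim)
    with torch.no_grad():
        _, latent = model(noisy_tokens, zero)
    logits, _ = model(noisy_tokens, latent)
    return criterion(logits.transpose(1, 2), targets)  # (B, V, L) vs (B, L)
```

Running pass 1 without gradients is the usual way self-conditioning keeps the training cost close to that of a single forward pass; whether LDDMs do exactly this is an assumption here.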