
Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

August 1, 2025
Authors: Jinsong Li, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Jiaqi Wang, Dahua Lin
cs.AI

Abstract

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical application of DLLMs is hindered by a critical architectural constraint: the need for a statically predefined generation length. This static length allocation leads to a problematic trade-off: insufficient lengths cripple performance on complex tasks, while excessive lengths incur significant computational overhead and sometimes result in performance degradation. While the inference framework is rigid, we observe that the model itself possesses internal signals that correlate with the optimal response length for a given task. To bridge this gap, we leverage these latent signals and introduce DAEDAL, a novel training-free denoising strategy that enables Dynamic Adaptive Length Expansion for Diffusion Large Language Models. DAEDAL operates in two phases: 1) Before the denoising process, DAEDAL starts from a short initial length and iteratively expands it to a coarse task-appropriate length, guided by a sequence completion metric. 2) During the denoising process, DAEDAL dynamically intervenes by pinpointing and expanding insufficient generation regions through mask token insertion, ensuring the final output is fully developed. Extensive experiments on DLLMs demonstrate that DAEDAL achieves performance comparable, and in some cases superior, to meticulously tuned fixed-length baselines, while simultaneously enhancing computational efficiency by achieving a higher effective token ratio. By resolving the static length constraint, DAEDAL unlocks new potential for DLLMs, bridging a critical gap with their Autoregressive counterparts and paving the way for more efficient and capable generation.
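The abstract only sketches DAEDAL's two-phase control flow, so the following minimal, self-contained Python mock illustrates one way the phases could fit together. It is not the authors' implementation: `mock_denoise_step`, the `completion_score` proxy for the paper's sequence completion metric, and every threshold, window size, and name are illustrative assumptions.

```python
"""Hedged sketch of a DAEDAL-style two-phase length expansion (assumptions only)."""
import random

MASK, EOS = "<mask>", "<eos>"

def mock_denoise_step(seq):
    # Stand-in for one parallel DLLM denoising step: resolve a few masked
    # positions, favoring EOS toward the end of the sequence (assumption).
    out = list(seq)
    masked = [i for i, t in enumerate(out) if t == MASK]
    for i in random.sample(masked, k=min(4, len(masked))):
        out[i] = EOS if random.random() < 0.5 * i / len(out) else f"tok{i}"
    return out

def completion_score(seq):
    # Assumed proxy for the "sequence completion metric": the fraction of
    # the final window already resolved to EOS.
    tail = seq[-8:]
    return sum(t == EOS for t in tail) / len(tail)

def daedal_style_generate(init_len=16, max_len=256, expand_by=16,
                          phase1_threshold=0.25, low_density=0.2):
    # Phase 1: start short and expand before denoising until a probe step
    # suggests the length is roughly adequate for the task.
    seq = [MASK] * init_len
    while len(seq) < max_len and completion_score(mock_denoise_step(seq)) < phase1_threshold:
        seq = seq + [MASK] * expand_by

    # Phase 2: denoise, and insert extra mask tokens into regions that look
    # under-developed (here: windows with too few resolved tokens).
    while any(t == MASK for t in seq):
        seq = mock_denoise_step(seq)
        for start in range(0, len(seq) - 8, 8):
            window = seq[start:start + 8]
            resolved = sum(t != MASK for t in window)
            if 0 < resolved / len(window) < low_density and len(seq) < max_len:
                seq = seq[:start + 8] + [MASK] * expand_by + seq[start + 8:]
    return seq

if __name__ == "__main__":
    print(len(daedal_style_generate()), "tokens generated")
```

In a real DLLM, the Phase 1 probe and the Phase 2 under-development check would read the model's own internal signals (for example, its EOS or token confidences at masked positions) rather than the random mock used here.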