

FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation

January 30, 2026
作者: Siyang He, Qiqi Wang, Xiaoran Liu, Hongnan Ma, Yiwei Shi, Yuerong Song, Ying Zhu, Tianyi Liang, Zengfeng Huang, Ziwei He, Xipeng Qiu
cs.AI

Abstract

Despite the non-autoregressive potential of diffusion language models (dLLMs), existing decoding strategies exhibit positional bias and fail to fully unlock the potential of arbitrary-order generation. In this work, we examine the inherent spectral characteristics of dLLMs and present the first frequency-domain analysis, showing that low-frequency components of hidden states primarily encode global structural information and long-range dependencies, while high-frequency components capture local details. Based on this observation, we propose FourierSampler, which uses a frequency-domain sliding-window mechanism to dynamically guide the model toward "structure-to-detail" generation. FourierSampler outperforms other inference-enhancement strategies on LLaDA and SDAR, achieving relative improvements of 20.4% on LLaDA1.5-8B and 16.0% on LLaDA-8B-Instruct, and notably surpasses similarly sized autoregressive models such as Llama3.1-8B-Instruct.
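To make the low/high-frequency decomposition concrete, the sketch below splits a sequence of hidden states into low-frequency (global structure) and high-frequency (local detail) parts with a real FFT along the sequence axis. This is an illustrative toy, not the paper's implementation: the function name `frequency_split`, the `cutoff_ratio` parameter, and the random hidden states are all assumptions for demonstration.

```python
import numpy as np

def frequency_split(hidden, cutoff_ratio):
    """Split hidden states of shape (seq_len, dim) into low- and
    high-frequency parts along the sequence axis via a real FFT.

    cutoff_ratio in (0, 1] is the fraction of frequency bins treated
    as "low frequency"; the remaining bins form the "high" part.
    (Illustrative sketch only -- not FourierSampler itself.)
    """
    seq_len = hidden.shape[0]
    spec = np.fft.rfft(hidden, axis=0)            # (seq_len//2 + 1, dim)
    cutoff = max(1, int(cutoff_ratio * spec.shape[0]))
    low_spec = np.zeros_like(spec)
    low_spec[:cutoff] = spec[:cutoff]             # keep global structure
    high_spec = spec - low_spec                   # residual local detail
    low = np.fft.irfft(low_spec, n=seq_len, axis=0)
    high = np.fft.irfft(high_spec, n=seq_len, axis=0)
    return low, high

rng = np.random.default_rng(0)
hidden = rng.standard_normal((64, 16))            # toy hidden states
low, high = frequency_split(hidden, 0.25)
assert np.allclose(low + high, hidden)            # decomposition is lossless
```

A "sliding window" in this frequency view could then raise `cutoff_ratio` over denoising steps, shifting the generation's emphasis from coarse structure toward fine detail, in the spirit of the structure-to-detail schedule described above.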