FourierSampler: Unlocking Non-Autoregressive Potential in Diffusion Language Models via Frequency-Guided Generation
January 30, 2026
Authors: Siyang He, Qiqi Wang, Xiaoran Liu, Hongnan Ma, Yiwei Shi, Yuerong Song, Ying Zhu, Tianyi Liang, Zengfeng Huang, Ziwei He, Xipeng Qiu
cs.AI
Abstract
Despite the non-autoregressive potential of diffusion language models (dLLMs), existing decoding strategies exhibit positional bias and fail to fully unlock their potential for arbitrary-order generation. In this work, we examine the inherent spectral characteristics of dLLMs and present the first frequency-domain analysis showing that low-frequency components of the hidden states primarily encode global structural information and long-range dependencies, while high-frequency components capture local details. Based on this observation, we propose FourierSampler, which uses a frequency-domain sliding-window mechanism to dynamically guide the model toward "structure-to-detail" generation. FourierSampler outperforms other inference-enhancement strategies on LLaDA and SDAR, achieving relative improvements of 20.4% on LLaDA1.5-8B and 16.0% on LLaDA-8B-Instruct, and it notably surpasses similarly sized autoregressive models such as Llama3.1-8B-Instruct.
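
To make the frequency-domain view concrete, the following minimal Python sketch decomposes a matrix of hidden states into low- and high-frequency components along the sequence axis. This is only an illustration of the general technique the abstract refers to, not the authors' implementation; the function name split_frequency_bands and the cutoff ratio are assumptions introduced for this example.

    # Illustrative sketch (assumed names, not the paper's code): split hidden states
    # into low- and high-frequency parts along the token-position axis, mirroring the
    # claim that low frequencies carry global structure and high frequencies carry
    # local detail.
    import numpy as np

    def split_frequency_bands(hidden_states: np.ndarray, low_freq_ratio: float = 0.25):
        """hidden_states: [seq_len, d_model] real-valued array.

        Returns (low, high), both [seq_len, d_model], with low + high == hidden_states.
        """
        seq_len = hidden_states.shape[0]
        # Real FFT along the sequence (token-position) axis.
        spectrum = np.fft.rfft(hidden_states, axis=0)        # [seq_len//2 + 1, d_model]
        cutoff = max(1, int(spectrum.shape[0] * low_freq_ratio))

        # Low-pass: keep only the first `cutoff` frequency bins, zero the rest.
        low_spec = np.zeros_like(spectrum)
        low_spec[:cutoff] = spectrum[:cutoff]
        low = np.fft.irfft(low_spec, n=seq_len, axis=0)

        # High-frequency component is the residual.
        high = hidden_states - low
        return low, high

    if __name__ == "__main__":
        # Toy example: random "hidden states" for a 64-token sequence, 16-dim model.
        h = np.random.randn(64, 16)
        low, high = split_frequency_bands(h, low_freq_ratio=0.25)
        print(np.allclose(low + high, h))  # True up to numerical precision

Under such a decomposition, the low-pass component would be expected to vary slowly across token positions (global structure and long-range dependencies), while the residual captures rapid local variation, which is the kind of separation a frequency-guided, structure-to-detail decoding schedule like FourierSampler's sliding window would exploit.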