A Survey on Diffusion Language Models
August 14, 2025
Authors: Tianyi Li, Mingda Chen, Bowei Guo, Zhiqiang Shen
cs.AI
Abstract
Diffusion Language Models (DLMs) are rapidly emerging as a powerful and
promising alternative to the dominant autoregressive (AR) paradigm. By
generating tokens in parallel through an iterative denoising process, DLMs
possess inherent advantages in reducing inference latency and capturing
bidirectional context, thereby enabling fine-grained control over the
generation process. Recent advancements have enabled DLMs to match the
performance of their autoregressive counterparts while achieving a
several-fold speed-up, making them a compelling choice for various
natural language processing tasks. In this survey, we provide a holistic
overview of the current DLM landscape. We trace its evolution and relationship
with other paradigms, such as autoregressive and masked language models, and
cover both foundational principles and state-of-the-art models. Our work offers
an up-to-date, comprehensive taxonomy and an in-depth analysis of current
techniques, from pre-training strategies to advanced post-training methods.
Another contribution of this survey is a thorough review of DLM inference
strategies and optimizations, including improvements in decoding parallelism,
caching mechanisms, and generation quality. We also highlight the latest
approaches to multimodal extensions of DLMs and delineate their applications
across various practical scenarios. Furthermore, our discussion addresses the
limitations and challenges of DLMs, including efficiency, long-sequence
handling, and infrastructure requirements, while outlining future research
directions to sustain progress in this rapidly evolving field. The project's
GitHub repository is available at https://github.com/VILA-Lab/Awesome-DLMs.
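The decoding loop contrasted with AR generation above can be illustrated with a minimal sketch. This is not any specific DLM's algorithm: `toy_denoiser` is a hypothetical stand-in for a trained bidirectional model, and the confidence-based unmasking schedule is one common design choice among several. The point is only the shape of the loop: start fully masked, predict all masked positions at once, and commit several tokens per step in parallel.

```python
import random

MASK = "<mask>"

def toy_denoiser(seq):
    """Stand-in for a bidirectional model: for each masked position,
    return a (token, confidence) guess. A real DLM would condition on
    the full left AND right context; here we emit placeholder tokens
    with random confidences purely to drive the loop."""
    vocab = ["the", "cat", "sat", "on", "mat"]
    return {
        i: (random.choice(vocab), random.random())
        for i, tok in enumerate(seq)
        if tok == MASK
    }

def diffusion_decode(length=8, tokens_per_step=2, seed=0):
    """Iterative denoising: start from an all-mask sequence and, at
    every step, commit the `tokens_per_step` most confident predictions
    in parallel -- rather than one left-to-right token per step as in
    autoregressive decoding."""
    random.seed(seed)
    seq = [MASK] * length
    steps = 0
    while MASK in seq:
        guesses = toy_denoiser(seq)
        # Parallel commit: unmask the most confident positions.
        ranked = sorted(guesses.items(), key=lambda kv: -kv[1][1])
        for i, (tok, _conf) in ranked[:tokens_per_step]:
            seq[i] = tok
        steps += 1
    return seq, steps

seq, steps = diffusion_decode()
print(steps)  # 8 positions / 2 per step -> 4 denoising steps
```

With `tokens_per_step` tokens committed per pass, a length-`L` sequence finishes in about `L / tokens_per_step` denoising steps, which is the source of the several-fold latency reduction the abstract mentions relative to the `L` sequential steps of AR decoding.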