dLLM: Simple Diffusion Language Modeling
February 26, 2026
Authors: Zhanhui Zhou, Lingjie Chen, Hanghang Tong, Dawn Song
cs.AI
Abstract
Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures.
To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling -- training, inference, and evaluation -- and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.
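For intuition on what "converting a BERT-style encoder or autoregressive LM into a DLM" involves, the masked-diffusion training objective used by LLaDA-style models can be sketched as follows. This is an illustrative simplification, not dLLM's actual API: the function name, the per-position `logprob_fn` interface, and the 1/t reweighting shown here are assumptions based on the general masked-diffusion formulation, and a real implementation would operate on batched tensors.

```python
import random
import math

def masked_diffusion_loss(token_ids, logprob_fn, mask_id, t=None):
    """One illustrative training step of a masked-diffusion LM.

    Sample a mask ratio t ~ U(0, 1), mask each token independently with
    probability t, then score the model's log-probability of the original
    token at each masked position, reweighting the loss by 1/t.

    logprob_fn(masked_seq, i, target) is a stand-in for a model forward
    pass returning log p(target | masked_seq) at position i (hypothetical
    interface for this sketch).
    """
    if t is None:
        t = random.random() or 1e-3  # avoid t == 0
    # Forward process: independently replace tokens with [MASK] at rate t.
    masked = [mask_id if random.random() < t else tok for tok in token_ids]
    positions = [i for i, m in enumerate(masked) if m == mask_id]
    # Cross-entropy on masked positions only, reweighted by 1/t so the
    # objective is an (upper bound on the) negative log-likelihood.
    nll = -sum(logprob_fn(masked, i, token_ids[i]) for i in positions)
    return nll / (t * len(token_ids))
```

With t = 1 every token is masked and the objective reduces to ordinary per-token cross-entropy under a fully masked context, which is why a BERT-style encoder (already trained on masked prediction) or an AR model (trained on next-token prediction, a special masking pattern) is a natural starting point for conversion.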