dLLM: Simple Diffusion Language Modeling

Abstract

Although diffusion language models (DLMs) are evolving quickly, many recent models converge on a set of shared components. These components, however, are distributed across ad-hoc research codebases or lack transparent implementations, making them difficult to reproduce or extend. As the field accelerates, there is a clear need for a unified framework that standardizes these common components while remaining flexible enough to support new methods and architectures. To address this gap, we introduce dLLM, an open-source framework that unifies the core components of diffusion language modeling -- training, inference, and evaluation -- and makes them easy to customize for new designs. With dLLM, users can reproduce, finetune, deploy, and evaluate open-source large DLMs such as LLaDA and Dream through a standardized pipeline. The framework also provides minimal, reproducible recipes for building small DLMs from scratch with accessible compute, including converting any BERT-style encoder or autoregressive LM into a DLM. We also release the checkpoints of these small DLMs to make DLMs more accessible and accelerate future research.
