A Survey on Diffusion Language Models

Abstract

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs have inherent advantages in reducing inference latency and capturing bidirectional context, which enables fine-grained control over the generation process. Recent advances have allowed DLMs to reach performance comparable to their autoregressive counterparts while achieving a several-fold speed-up, making them a compelling choice for a variety of natural language processing tasks. In this survey, we provide a holistic overview of the current DLM landscape. We trace the evolution of DLMs and their relationship to other paradigms, such as autoregressive and masked language models, and cover both foundational principles and state-of-the-art models. Our work offers an up-to-date, comprehensive taxonomy and an in-depth analysis of current techniques, from pre-training strategies to advanced post-training methods. Another contribution of this survey is a thorough review of DLM inference strategies and optimizations, including improvements in decoding parallelism, caching mechanisms, and generation quality. We also highlight the latest approaches to multimodal extensions of DLMs and describe their applications across various practical scenarios. Finally, we discuss the limitations and challenges of DLMs, including efficiency, long-sequence handling, and infrastructure requirements, and outline future research directions to sustain progress in this rapidly evolving field. The project's GitHub repository is available at https://github.com/VILA-Lab/Awesome-DLMs.
