Beyond Fixed: Variable-Length Denoising for Diffusion Large Language Models

Abstract

Diffusion Large Language Models (DLLMs) are emerging as a powerful alternative to the dominant Autoregressive Large Language Models, offering efficient parallel generation and capable global context modeling. However, the practical application of DLLMs is hindered by a critical architectural constraint: the need for a statically predefined generation length. This static length allocation leads to a problematic trade-off: insufficient lengths cripple performance on complex tasks, while excessive lengths incur significant computational overhead and sometimes degrade performance. Although the inference framework is rigid, we observe that the model itself possesses internal signals that correlate with the optimal response length for a given task. To bridge this gap, we leverage these latent signals and introduce DAEDAL, a novel training-free denoising strategy that enables Dynamic Adaptive Length Expansion for Diffusion Large Language Models. DAEDAL operates in two phases: 1) Before the denoising process, DAEDAL starts from a short initial length and iteratively expands it to a coarse task-appropriate length, guided by a sequence completion metric. 2) During the denoising process, DAEDAL dynamically intervenes by pinpointing and expanding insufficient generation regions through mask token insertion, ensuring the final output is fully developed. Extensive experiments on DLLMs demonstrate that DAEDAL achieves performance comparable, and in some cases superior, to meticulously tuned fixed-length baselines, while simultaneously enhancing computational efficiency by achieving a higher effective token ratio. By resolving the static length constraint, DAEDAL unlocks new potential for DLLMs, bridging a critical gap with their autoregressive counterparts and paving the way for more efficient and capable generation.
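The two phases described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the sequence completion metric or the per-position confidences are computed, so both are supplied by the caller here, and every name, threshold, and expansion size (`completion_score`, `expand_by`, `threshold`, `low_threshold`, `max_length`) is a hypothetical choice made only for illustration.

```python
from typing import Callable, List

MASK = "<mask>"  # hypothetical mask-token symbol


def expand_initial_length(
    completion_score: Callable[[int], float],
    initial_length: int = 32,
    expand_by: int = 16,
    threshold: float = 0.9,
    max_length: int = 512,
) -> int:
    """Phase 1 sketch: grow an all-mask canvas before denoising.

    `completion_score(length)` stands in for the paper's sequence
    completion metric (assumed: higher means the length looks sufficient).
    """
    length = initial_length
    while length < max_length and completion_score(length) < threshold:
        length = min(length + expand_by, max_length)
    return length


def expand_insufficient_region(
    tokens: List[str],
    confidences: List[float],
    low_threshold: float = 0.2,
    expand_by: int = 4,
) -> List[str]:
    """Phase 2 sketch: widen the weakest masked position during denoising.

    The least confident masked position, if its confidence is below
    `low_threshold`, is replaced by `expand_by` mask tokens.
    """
    masked = [i for i, tok in enumerate(tokens) if tok == MASK]
    if not masked:
        return list(tokens)
    weakest = min(masked, key=lambda i: confidences[i])
    if confidences[weakest] >= low_threshold:
        return list(tokens)
    return tokens[:weakest] + [MASK] * expand_by + tokens[weakest + 1:]


if __name__ == "__main__":
    # Toy metric (assumption): the score rises linearly with canvas length.
    print(expand_initial_length(lambda n: min(1.0, n / 100)))
    print(expand_insufficient_region(["a", MASK, "b", MASK],
                                     [0.99, 0.1, 0.95, 0.6], expand_by=3))
```

In a real decoder, both functions would be driven by model outputs at each step. Here they take plain numbers so the control flow can be checked in isolation.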
