Continuously Augmented Discrete Diffusion model for Categorical Generative Modeling

Abstract

Standard discrete diffusion models treat all unobserved states identically by mapping them to an absorbing [MASK] token. This creates an 'information void': semantic information that could be inferred from unmasked tokens is lost between denoising steps. We introduce Continuously Augmented Discrete Diffusion (CADD), a framework that augments the discrete state space with a paired diffusion in a continuous latent space. This yields graded, gradually corrupted states in which masked tokens are represented by noisy yet informative latent vectors rather than collapsed 'information voids'. At each reverse step, CADD can leverage the continuous latent as a semantic hint to guide discrete denoising. The design is clean and compatible with existing discrete diffusion training. At sampling time, the strength and choice of estimator for the continuous latent vector enable a controlled trade-off between mode-coverage behavior (generating diverse outputs) and mode-seeking behavior (generating contextually precise outputs). Empirically, we show that CADD improves generative quality over mask-based diffusion on text generation, image synthesis, and code modeling, with consistent gains on both qualitative and quantitative metrics against strong discrete baselines.
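The paired corruption described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the linear masking schedule, the square-root signal and noise weights, the `MASK` id, and the names `forward_corrupt` and `embed` are all assumptions made for illustration. The sketch only shows the idea that a masked position keeps a noisy latent vector correlated with its original token, instead of collapsing to an uninformative state.

```python
import math
import random

MASK = -1  # hypothetical id for the absorbing [MASK] state


def forward_corrupt(tokens, embed, t, rng):
    """Corrupt a token sequence at noise level t in [0, 1].

    Returns (discrete_states, latents).
    Assumed schedules (illustrative, not taken from the paper):
      - a token stays unmasked with probability 1 - t;
      - a masked position carries the latent
        sqrt(1 - t) * embedding + sqrt(t) * Gaussian noise.
    """
    keep_prob = 1.0 - t
    signal, noise = math.sqrt(1.0 - t), math.sqrt(t)
    states, latents = [], []
    for tok in tokens:
        clean = embed[tok]
        if rng.random() < keep_prob:
            # Unmasked position: token and clean embedding are both visible.
            states.append(tok)
            latents.append(list(clean))
        else:
            # Masked position: the discrete state is absorbed into MASK,
            # but the latent still carries a noisy trace of the token.
            states.append(MASK)
            latents.append(
                [signal * c + noise * rng.gauss(0.0, 1.0) for c in clean]
            )
    return states, latents


# Toy usage with a hypothetical 3-token vocabulary and 2-d embeddings.
toy_embed = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
states, latents = forward_corrupt([0, 1, 2, 1], toy_embed, 0.5, random.Random(0))
```

Under these assumptions, at `t = 0` every token is kept with its clean embedding, and at `t = 1` every position is masked and its latent is pure noise; intermediate values of `t` give the graded states the abstract refers to.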
