Discrete Markov Bridge

Abstract

Discrete diffusion has recently emerged as a promising paradigm in discrete data modeling. However, existing methods typically rely on a fixed rate transition matrix during training, which not only limits the expressiveness of latent representations, a fundamental strength of variational methods, but also constrains the overall design space. To address these limitations, we propose Discrete Markov Bridge, a novel framework specifically designed for discrete representation learning. Our approach is built upon two key components: Matrix Learning and Score Learning. We conduct a rigorous theoretical analysis, establishing formal performance guarantees for Matrix Learning and proving the convergence of the overall framework. Furthermore, we analyze the space complexity of our method, addressing practical constraints identified in prior studies. Extensive empirical evaluations validate the effectiveness of the proposed Discrete Markov Bridge, which achieves an Evidence Lower Bound (ELBO) of 1.38 on the Text8 dataset, outperforming established baselines. Moreover, the proposed model demonstrates competitive performance on the CIFAR-10 dataset, achieving results comparable to those obtained by image-specific generation approaches.
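
To make the notion of a fixed rate transition matrix concrete, the following is a minimal sketch of the standard continuous-time Markov chain setting, in which a rate matrix Q drives the forward marginal p_t = p_0 exp(tQ). It is not the method proposed here: the uniform rate matrix, the row-vector convention, and every function name are assumptions chosen for illustration only.

```python
import math


def uniform_rate_matrix(n):
    """Hypothetical fixed rate matrix: off-diagonal 1/n, each row sums to zero."""
    return [[(1.0 / n) - (1.0 if i == j else 0.0) for j in range(n)]
            for i in range(n)]


def mat_mul(a, b):
    """Plain square matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(q, t, terms=40):
    """Truncated Taylor series for exp(t * Q); only suitable for small t * Q."""
    n = len(q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term now holds (tQ)^k / k!
        term = [[x * t / k for x in row] for row in mat_mul(term, q)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def forward_marginal(p0, q, t):
    """Marginal at time t, p_t = p_0 exp(tQ), with p_0 treated as a row vector."""
    p = mat_exp(q, t)
    n = len(p0)
    return [sum(p0[i] * p[i][j] for i in range(n)) for j in range(n)]


def uniform_closed_form(p0, t):
    """Closed form for the uniform case: exp(tQ) = e^{-t} I + (1 - e^{-t}) / n."""
    n = len(p0)
    decay = math.exp(-t)
    total = sum(p0)
    return [decay * x + (1.0 - decay) * total / n for x in p0]


if __name__ == "__main__":
    q = uniform_rate_matrix(4)
    print(forward_marginal([1.0, 0.0, 0.0, 0.0], q, 1.0))
```

With Q held fixed in this way, the corrupted distribution at every time t is fully determined by the data distribution, which is the sense in which a fixed matrix leaves no freedom in the latent representation.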