Discrete Markov Bridge

Abstract

Discrete diffusion has recently emerged as a promising paradigm for modeling discrete data. However, existing methods typically rely on a rate transition matrix that is held fixed during training. This limits the expressiveness of the latent representations, which is a fundamental strength of variational methods, and it also constrains the overall design space. To address these limitations, we propose Discrete Markov Bridge, a novel framework designed specifically for discrete representation learning. Our approach is built on two key components: Matrix Learning and Score Learning. We conduct a rigorous theoretical analysis, establishing formal performance guarantees for Matrix Learning and proving the convergence of the overall framework. We also analyze the space complexity of our method, addressing practical constraints identified in prior studies. Extensive empirical evaluations validate the effectiveness of the proposed Discrete Markov Bridge, which achieves an Evidence Lower Bound (ELBO) of 1.38 on the Text8 dataset, outperforming established baselines. The proposed model is also competitive on the CIFAR-10 dataset, achieving results comparable to those of approaches designed specifically for image generation.