Discrete Markov Bridge

Abstract

Discrete diffusion has recently emerged as a promising paradigm for modeling discrete data. However, existing methods typically rely on a fixed transition rate matrix during training, which not only limits the expressiveness of latent representations, a fundamental strength of variational methods, but also constrains the overall design space. To address these limitations, we propose Discrete Markov Bridge, a novel framework specifically designed for discrete representation learning. Our approach is built upon two key components: Matrix Learning and Score Learning. We conduct a rigorous theoretical analysis, establishing formal performance guarantees for Matrix Learning and proving the convergence of the overall framework. Furthermore, we analyze the space complexity of our method, addressing practical constraints identified in prior studies. Extensive empirical evaluations validate the effectiveness of the proposed Discrete Markov Bridge, which achieves an Evidence Lower Bound (ELBO) of 1.38 on the Text8 dataset, outperforming established baselines. Moreover, the proposed model demonstrates competitive performance on the CIFAR-10 dataset, achieving results comparable to those obtained by image-specific generation approaches.