Generalized Discrete Diffusion from Snapshots

Abstract

We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing discrete diffusion approaches, while allowing significantly greater flexibility in the choice of corruption dynamics. The forward noising process relies on uniformization and enables fast arbitrary corruption. For the reverse process, we derive a simple evidence lower bound (ELBO) based on snapshot latents rather than the entire noising path, which allows efficient training of standard generative modeling architectures with a clear probabilistic interpretation. Our experiments on large-vocabulary discrete generation tasks suggest that the proposed framework outperforms existing discrete diffusion methods in terms of training efficiency and generation quality, and beats autoregressive models for the first time at this scale. We provide the code along with a blog post on the project page: https://oussamazekri.fr/gdds.
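To make the role of uniformization concrete, the following is a minimal sketch of the textbook construction for a small continuous-time Markov chain, not the implementation released with the paper. It assumes a rate matrix `Q` with nonnegative off-diagonal entries, rows summing to zero, and at least one nonzero rate. The function names (`uniformized_kernel`, `sample_poisson`, `sample_snapshot`) and the three-state example are hypothetical and chosen only for illustration. The idea is that the state at time `t` can be drawn by sampling a Poisson number of jumps with mean `lam * t` and applying the discrete kernel `P = I + Q / lam` that many times, so only the final state (a snapshot) is kept, not the path.

```python
import math
import random


def uniformized_kernel(Q):
    """Return (lam, P) where P = I + Q / lam is a row-stochastic matrix.

    Assumes Q is a valid rate matrix with at least one nonzero rate.
    """
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # uniformization rate: largest exit rate
    P = [
        [(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
        for i in range(n)
    ]
    return lam, P


def sample_poisson(mean, rng):
    """Draw from Poisson(mean) with Knuth's method (adequate for small means)."""
    limit = math.exp(-mean)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def sample_snapshot(x0, t, Q, rng):
    """Sample the state at time t given start state x0, discarding the path."""
    lam, P = uniformized_kernel(Q)
    n_jumps = sample_poisson(lam * t, rng)  # number of candidate jumps in [0, t]
    states = range(len(Q))
    x = x0
    for _ in range(n_jumps):
        # Candidate jumps may be self-transitions when P[x][x] > 0.
        x = rng.choices(states, weights=P[x])[0]
    return x


# Hypothetical example: uniform noising over three states with unit exit rate.
Q = [
    [-1.0, 0.5, 0.5],
    [0.5, -1.0, 0.5],
    [0.5, 0.5, -1.0],
]
rng = random.Random(0)
snapshot = sample_snapshot(x0=0, t=1.5, Q=Q, rng=rng)
```

The sketch applies the kernel one jump at a time for readability. For large vocabularies, an explicit dense matrix such as `P` would likely be impractical, so a structured or implicit representation of the jump kernel would be needed instead.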