Generalized Discrete Diffusion from Snapshots

Abstract

We introduce Generalized Discrete Diffusion from Snapshots (GDDS), a unified framework for discrete diffusion modeling that supports arbitrary noising processes over large discrete state spaces. Our formulation encompasses all existing discrete diffusion approaches while allowing significantly greater flexibility in the choice of corruption dynamics. The forward noising process relies on uniformization and enables fast arbitrary corruption. For the reverse process, we derive a simple evidence lower bound (ELBO) based on snapshot latents, rather than on the entire noising path, which allows efficient training of standard generative modeling architectures with a clear probabilistic interpretation. Our experiments on large-vocabulary discrete generation tasks suggest that the proposed framework outperforms existing discrete diffusion methods in terms of training efficiency and generation quality, and beats autoregressive models for the first time at this scale. We provide the code, along with a blog post, on the project page: https://oussamazekri.fr/gdds
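To make the role of uniformization concrete, the following is a minimal sketch of the textbook construction, not the GDDS implementation itself. It assumes a continuous-time Markov chain with a time-homogeneous rate matrix Q; the names `uniformized_kernel` and `sample_snapshot` are hypothetical and chosen only for illustration. With a rate lam at least as large as every exit rate, P = I + Q / lam is a stochastic matrix, and the state at time t has the same law as a discrete chain driven by P for a Poisson(lam * t) number of steps.

```python
import numpy as np


def uniformized_kernel(Q, t, tol=1e-12, max_terms=10_000):
    """Approximate exp(t * Q) as a Poisson-weighted sum of powers of P.

    Textbook uniformization; an illustrative assumption, not the paper's code.
    """
    Q = np.asarray(Q, dtype=float)
    k = Q.shape[0]
    lam = float(np.max(-np.diag(Q)))  # uniformization rate >= every exit rate
    if lam == 0.0:
        return np.eye(k)  # no transitions at all
    P = np.eye(k) + Q / lam  # stochastic matrix of the uniformized chain

    weight = np.exp(-lam * t)  # Poisson pmf at n = 0
    power = np.eye(k)  # P^0
    kernel = weight * power
    total = weight
    n = 0
    # Add terms until the remaining Poisson mass is negligible.
    while 1.0 - total > tol and n < max_terms:
        n += 1
        weight *= lam * t / n  # Poisson pmf at n from the pmf at n - 1
        power = power @ P
        kernel += weight * power
        total += weight
    return kernel


def sample_snapshot(x0, Q, t, rng):
    """Draw the state at time t from x0 without simulating holding times."""
    Q = np.asarray(Q, dtype=float)
    k = Q.shape[0]
    lam = float(np.max(-np.diag(Q)))
    if lam == 0.0:
        return x0
    P = np.eye(k) + Q / lam
    n_jumps = rng.poisson(lam * t)  # number of candidate jumps in [0, t]
    x = x0
    for _ in range(n_jumps):
        x = int(rng.choice(k, p=P[x]))
    return x
```

For uniform noise, Q = (1/K) 11^T - I, the kernel has the closed form e^{-t} I + (1 - e^{-t}) (1/K) 11^T, which gives a convenient sanity check for the truncated series. The sampler returns only the state at time t (a snapshot) rather than the whole trajectory, which is the kind of quantity the abstract's snapshot-based ELBO conditions on.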