Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion

Abstract

Discrete diffusion models have emerged as powerful frameworks for generating structured categorical data. However, efficiently sampling from reward-tilted distributions remains a fundamental challenge. While twisted Sequential Monte Carlo (SMC) is asymptotically exact for this task, estimating the optimal twist function in discrete state spaces requires costly Monte Carlo approximations, resulting in a severe computational bottleneck at inference. To overcome this limitation, we introduce Contrastive Distribution Matching (CDM), a novel framework that amortizes the cost of SMC inference by learning a parameterized twist function from positive and negative samples. For efficient training, we reformulate the gradient estimator to leverage the closed-form forward kernels of discrete diffusion models. In practice, evaluating our learned twist function incurs less than 5% additional computational overhead compared to a single forward pass of the base model. Through extensive empirical evaluations, we demonstrate that CDM consistently outperforms existing baselines under matched wall-clock time. We validate the effectiveness and versatility of our approach across a diverse range of applications, including toxic text generation, regulatory DNA sequence design, protein designability, and diffusion large language model alignment.