DINGO: Constrained Inference for Diffusion LLMs
May 29, 2025
Authors: Tarun Suresh, Debangshu Banerjee, Shubham Ugare, Sasa Misailovic, Gagandeep Singh
cs.AI
Abstract
Diffusion LLMs have emerged as a promising alternative to conventional
autoregressive LLMs, offering significant potential for improved runtime
efficiency. However, existing diffusion models lack the ability to provably
enforce user-specified formal constraints, such as regular expressions, which
makes them unreliable for tasks that require structured outputs, such as
fixed-schema JSON generation. Unlike autoregressive models that generate tokens
sequentially, diffusion LLMs predict a block of tokens in parallel. This
parallelism makes traditional constrained decoding algorithms, which are
designed for sequential token prediction, ineffective at preserving the true
output distribution. To address this limitation, we propose DINGO, a dynamic
programming-based constrained decoding strategy that is both efficient and
provably distribution-preserving. DINGO enables sampling of output strings with
the highest probability under the model's predicted distribution, while
strictly satisfying any user-specified regular expression. On standard symbolic
math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point
improvement over unconstrained inference.
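To make the core idea concrete, below is a minimal, illustrative sketch, not the authors' implementation: given the per-position token distributions that a diffusion LLM predicts in parallel for a block, a Viterbi-style dynamic program over the states of the regular expression's DFA recovers the highest-probability token sequence the DFA accepts. The function name, the explicit transition dictionary, and the toy vocabulary are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

def dingo_style_decode(
    log_probs: List[Dict[str, float]],      # per-position token log-probs
    dfa_step: Dict[Tuple[int, str], int],   # (state, token) -> next state
    start_state: int,
    accept_states: Set[int],
) -> List[str]:
    """Return the highest-probability fixed-length token sequence
    accepted by the DFA, via a Viterbi-style DP over DFA states."""
    # best[state] = (log-prob of best prefix reaching `state`, its tokens)
    best: Dict[int, Tuple[float, List[str]]] = {start_state: (0.0, [])}
    for dist in log_probs:  # one parallel-predicted distribution per position
        nxt: Dict[int, Tuple[float, List[str]]] = {}
        for state, (score, toks) in best.items():
            for tok, lp in dist.items():
                ns = dfa_step.get((state, tok))
                if ns is None:  # no valid DFA transition for this token here
                    continue
                cand_score = score + lp
                if ns not in nxt or cand_score > nxt[ns][0]:
                    nxt[ns] = (cand_score, toks + [tok])
        best = nxt
    accepted = {s: v for s, v in best.items() if s in accept_states}
    if not accepted:
        raise ValueError("no string of this length satisfies the regex")
    return max(accepted.values(), key=lambda v: v[0])[1]

# Toy check: regex (ab)* over a 2-token block with vocabulary {a, b}.
# Per-position greedy decoding would pick "a" then "a" (each locally most
# likely) and violate the regex; the DP returns the best accepted string.
step = {(0, "a"): 1, (1, "b"): 0}        # hand-built DFA for (ab)*; accepts in state 0
dists = [{"a": -0.1, "b": -2.3},          # log-probs, position 1
         {"a": -0.2, "b": -1.6}]          # log-probs, position 2
print(dingo_style_decode(dists, step, 0, {0}))  # -> ['a', 'b']
```

This sketches why a whole-sequence DP matters for parallel block prediction: masking invalid tokens position by position can commit to a locally likely token that drives the DFA into a dead state, whereas the DP scores complete sequences and falls back to the globally best string that satisfies the constraint.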