DINGO: Constrained Inference for Diffusion LLMs
May 29, 2025
作者: Tarun Suresh, Debangshu Banerjee, Shubham Ugare, Sasa Misailovic, Gagandeep Singh
cs.AI
Abstract
Diffusion LLMs have emerged as a promising alternative to conventional
autoregressive LLMs, offering significant potential for improved runtime
efficiency. However, existing diffusion models lack the ability to provably
enforce user-specified formal constraints, such as regular expressions, which
makes them unreliable for tasks that require structured outputs, such as
fixed-schema JSON generation. Unlike autoregressive models that generate tokens
sequentially, diffusion LLMs predict a block of tokens in parallel. This
parallelism makes traditional constrained decoding algorithms, which are
designed for sequential token prediction, ineffective at preserving the true
output distribution. To address this limitation, we propose DINGO, a dynamic
programming-based constrained decoding strategy that is both efficient and
provably distribution-preserving. DINGO enables sampling of output strings with
the highest probability under the model's predicted distribution, while
strictly satisfying any user-specified regular expression. On standard symbolic
math and JSON generation benchmarks, DINGO achieves up to a 68 percentage point
improvement over unconstrained inference.
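
To make the core idea concrete, here is a minimal sketch of Viterbi-style dynamic programming over DFA states for a parallel block prediction, under the simplifying assumption that the diffusion model gives an independent token distribution at each position in the block. The toy regex, DFA, vocabulary, and the function name `viterbi_constrained_block` are illustrative assumptions, not DINGO's actual implementation (which also addresses distribution-preserving sampling rather than only the argmax sequence).

```python
import numpy as np

# Toy DFA for the regex "ab*c" over a 3-token vocabulary {a, b, c}.
VOCAB = ["a", "b", "c"]
START, ACCEPT = 0, 2
# DELTA[state][token_id] -> next state, or None if the transition is illegal.
DELTA = {
    0: {0: 1, 1: None, 2: None},    # from start, only 'a' is legal
    1: {0: None, 1: 1, 2: 2},       # then any number of 'b', then 'c'
    2: {0: None, 1: None, 2: None}, # accepting state is terminal here
}

def viterbi_constrained_block(log_probs, delta, start, accept):
    """Viterbi-style DP over DFA states: returns the most probable
    length-T token sequence that ends in the accepting state.

    log_probs: (T, V) array of per-position log-probabilities
    predicted in parallel for one block.
    """
    T, V = log_probs.shape
    # best[s] = (score of the best prefix reaching state s, its tokens)
    best = {start: (0.0, [])}
    for t in range(T):
        nxt = {}
        for s, (score, toks) in best.items():
            for v in range(V):
                s2 = delta[s][v]
                if s2 is None:
                    continue  # transition violates the regex; prune it
                cand = score + log_probs[t, v]
                if s2 not in nxt or cand > nxt[s2][0]:
                    nxt[s2] = (cand, toks + [v])
        best = nxt
    if accept not in best:
        return None  # no length-T string satisfies the constraint
    return best[accept][1]

# Example: position-wise greedy decoding would pick "bbc" (invalid
# under "ab*c"); the DP instead recovers the best valid block "abc".
probs = np.array([[0.4, 0.5, 0.1],
                  [0.3, 0.6, 0.1],
                  [0.1, 0.2, 0.7]])
tokens = viterbi_constrained_block(np.log(probs), DELTA, START, ACCEPT)
print("".join(VOCAB[t] for t in tokens))  # -> "abc"
```

The sketch shows why per-position greedy masking is not enough for parallel prediction: the locally most likely tokens can jointly violate the constraint, whereas the DP searches over whole-block paths through the automaton and keeps only reachable, eventually-accepting ones.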