Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models
February 13, 2026
Authors: Joshua Ong Jun Leang, Yu Zhao, Mihaela Cătălina Stoian, Wenda Li, Shay B. Cohen, Eleonora Giunchiglia
cs.AI
Abstract
While plan-and-infill decoding in Masked Diffusion Models (MDMs) shows promise for mathematical and code reasoning, performance remains highly sensitive to slot infilling order, often yielding substantial output variance. We introduce McDiffuSE, a framework that formulates slot selection as decision making and optimises infilling orders through Monte Carlo Tree Search (MCTS). McDiffuSE uses look-ahead simulations to evaluate partial completions before commitment, systematically exploring the combinatorial space of generation orders. Experiments show an average improvement of 3.2% over autoregressive baselines and 8.0% over baseline plan-and-infill, with notable gains of 19.5% on MBPP and 4.9% on MATH500. Our analysis reveals that while McDiffuSE predominantly follows sequential ordering, incorporating non-sequential generation is essential for maximising performance. We observe that larger exploration constants, rather than increased simulations, are necessary to overcome model confidence biases and discover effective orderings. These findings establish MCTS-based planning as an effective approach for enhancing generation quality in MDMs.
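To make the idea of searching over infilling orders concrete, the following is a minimal sketch of UCT-style MCTS over slot orderings. It is not the McDiffuSE implementation: the names `Node`, `mcts_next_slot`, `plan_order`, and `toy_reward` are hypothetical, the rollout is a uniformly random completion of the remaining slots, and `toy_reward` is a stand-in for whatever score a masked diffusion model would assign to a partial completion. The parameter `c` plays the role of the exploration constant and `n_sims` the number of simulations per decision.

```python
import math
import random


class Node:
    """One search-tree node: a partial slot ordering (hypothetical structure)."""

    def __init__(self, order, parent=None):
        self.order = order      # tuple of slot indices filled so far
        self.parent = parent
        self.children = {}      # slot index -> child Node
        self.visits = 0
        self.total = 0.0        # sum of rollout rewards seen through this node


def untried(node, n_slots):
    """Slots that are neither filled nor already expanded as children."""
    return [s for s in range(n_slots)
            if s not in node.order and s not in node.children]


def uct_child(node, c):
    """Pick the child maximising mean reward plus a c-scaled exploration bonus."""
    return max(
        node.children.values(),
        key=lambda ch: ch.total / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts_next_slot(prefix, n_slots, reward_fn, n_sims, c, rng):
    """Run n_sims simulations from `prefix` and return the most-visited next slot."""
    root = Node(tuple(prefix))
    for _ in range(n_sims):
        node = root
        # Selection: descend while the node is fully expanded and not terminal.
        while len(node.order) < n_slots and not untried(node, n_slots):
            node = uct_child(node, c)
        # Expansion: add one unexplored slot choice, if any remain.
        options = untried(node, n_slots)
        if options:
            slot = rng.choice(options)
            child = Node(node.order + (slot,), parent=node)
            node.children[slot] = child
            node = child
        # Simulation: complete the ordering at random (assumption of this sketch).
        rest = [s for s in range(n_slots) if s not in node.order]
        rng.shuffle(rest)
        reward = reward_fn(node.order + tuple(rest))
        # Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.total += reward
            node = node.parent
    return max(root.children, key=lambda s: root.children[s].visits)


def plan_order(n_slots, reward_fn, n_sims=200, c=1.4, seed=0):
    """Commit to one slot at a time, re-running the search after each commitment."""
    rng = random.Random(seed)
    order = []
    while len(order) < n_slots:
        order.append(mcts_next_slot(order, n_slots, reward_fn, n_sims, c, rng))
    return order


def toy_reward(order):
    """Toy score in [0, 1]: fraction of adjacent pairs filled left to right."""
    if len(order) < 2:
        return 1.0
    hits = sum(1 for a, b in zip(order, order[1:]) if b == a + 1)
    return hits / (len(order) - 1)


if __name__ == "__main__":
    print(plan_order(5, toy_reward))
```

In this sketch, raising `c` shifts the search toward rarely visited slot choices, while raising `n_sims` only refines the estimates for choices the search already favours. That is one way to read the abstract's observation about exploration constants versus simulation counts, though the toy reward here says nothing about real model confidence.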