Can I Have Your Order? Monte-Carlo Tree Search for Slot Filling Ordering in Diffusion Language Models

Abstract

While plan-and-infill decoding in Masked Diffusion Models (MDMs) shows promise for mathematical and code reasoning, performance remains highly sensitive to slot infilling order, often yielding substantial output variance. We introduce McDiffuSE, a framework that formulates slot selection as decision making and optimises infilling orders through Monte Carlo Tree Search (MCTS). McDiffuSE uses look-ahead simulations to evaluate partial completions before commitment, systematically exploring the combinatorial space of generation orders. Experiments show an average improvement of 3.2% over autoregressive baselines and 8.0% over baseline plan-and-infill, with notable gains of 19.5% on MBPP and 4.9% on MATH500. Our analysis reveals that while McDiffuSE predominantly follows sequential ordering, incorporating non-sequential generation is essential for maximising performance. We observe that larger exploration constants, rather than increased simulations, are necessary to overcome model confidence biases and discover effective orderings. These findings establish MCTS-based planning as an effective approach for enhancing generation quality in MDMs.
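
The role of the exploration constant can be illustrated with a small sketch. The abstract does not state which selection rule McDiffuSE uses, so the code below assumes the standard UCB1 rule from generic MCTS. The function name `uct_select`, the `stats` layout, and all numbers are hypothetical and chosen only for illustration.

```python
import math


def uct_select(stats, c):
    """Return the slot index with the highest UCB1 score.

    stats maps a slot index to (visit_count, mean_value).
    c is the exploration constant. UCB1 is an assumed rule here,
    not necessarily the one used by McDiffuSE.
    """
    total_visits = sum(n for n, _ in stats.values())

    def score(item):
        _, (n, q) = item
        if n == 0:
            return math.inf  # unvisited slots are always tried first
        return q + c * math.sqrt(math.log(total_visits) / n)

    return max(stats.items(), key=score)[0]


# Hypothetical statistics: slot 0 looks best and has been visited often
# (the "confident" choice); slot 2 has been tried only once.
stats = {0: (20, 0.60), 1: (5, 0.50), 2: (1, 0.40)}

greedy = uct_select(stats, c=0.1)       # small constant: exploit slot 0
exploratory = uct_select(stats, c=2.0)  # large constant: revisit slot 2
```

With a small constant the search keeps choosing the slot that already looks best. With a larger constant the rarely visited slot wins the selection, which is one way a search could move past an initial confidence bias toward a particular ordering.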
