PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Abstract

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex contextualized situations that are often counterfactual, e.g., "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (counterfactual) planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the implicit knowledge in small language models, and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a novel task, Counterfactual Planning, which requires revising a plan to cope with a counterfactual situation. In both the original and counterfactual settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with, and often surpass, the capabilities of their larger teacher models.
