PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning
May 31, 2023
作者: Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi
cs.AI
Abstract
Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating commonsense knowledge to reason about complex, contextualized situations that are often counterfactual, e.g., "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning with smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (counterfactual) planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the implicit knowledge in small language models, and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a novel task, Counterfactual Planning, which requires revising a plan to cope with a counterfactual situation. In both the original and counterfactual settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with, and often surpass, the capabilities of their much larger teacher models.
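
To make the two-pronged approach concrete, the sketch below illustrates the distillation half of the pipeline under stated assumptions: (goal, plan) pairs generated by a teacher LLM are used to fine-tune a much smaller sequence-to-sequence student, and the counterfactual task is then framed as plan revision at inference time. The data, prompt templates, and model choice here are illustrative assumptions based on the abstract (a 770M-class T5 student), not the authors' released code.

```python
# Minimal sketch of symbolic procedural knowledge distillation: fine-tune a
# small seq2seq student on teacher-generated (goal, plan) pairs. All names,
# prompt formats, and data below are hypothetical illustrations.
import torch
from torch.utils.data import DataLoader
from transformers import T5ForConditionalGeneration, T5TokenizerFast

# Hypothetical distilled data: goals paired with teacher-generated step lists,
# assumed already filtered for quality.
pairs = [
    ("schedule a doctor's appointment without a phone",
     "Step 1: Find the clinic's website. Step 2: Open the online booking "
     "page. Step 3: Choose an available time slot. Step 4: Confirm the "
     "appointment."),
]

tokenizer = T5TokenizerFast.from_pretrained("t5-large")  # ~770M-class student
model = T5ForConditionalGeneration.from_pretrained("t5-large")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def collate(batch):
    goals, plans = zip(*batch)
    inputs = tokenizer(["plan: " + g for g in goals],
                       return_tensors="pt", padding=True, truncation=True)
    labels = tokenizer(list(plans), return_tensors="pt",
                       padding=True, truncation=True).input_ids
    labels[labels == tokenizer.pad_token_id] = -100  # mask padding in the loss
    return inputs, labels

model.train()
for inputs, labels in DataLoader(pairs, batch_size=2, collate_fn=collate):
    loss = model(**inputs, labels=labels).loss  # standard seq2seq LM loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Counterfactual planning can then be framed as plan revision: condition the
# student on the goal, the counterfactual constraint, and the initial plan
# (this prompt format is an assumption, not the paper's exact template).
model.eval()
query = ("revise plan: schedule a doctor's appointment without a phone "
         "| condition: the clinic has no website | plan: " + pairs[0][1])
ids = tokenizer(query, return_tensors="pt").input_ids
print(tokenizer.decode(model.generate(ids, max_new_tokens=128)[0],
                       skip_special_tokens=True))
```

Framing revision as conditional generation lets the same student handle both settings; the paper's inference-time algorithm additionally guides decoding toward more structured, accurate plans, which this sketch does not implement.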