PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning
May 31, 2023
Authors: Faeze Brahman, Chandra Bhagavatula, Valentina Pyatkin, Jena D. Hwang, Xiang Lorraine Li, Hirona J. Arai, Soumya Sanyal, Keisuke Sakaguchi, Xiang Ren, Yejin Choi
cs.AI
Abstract
Procedural planning, which entails decomposing a high-level goal into a
sequence of temporally ordered steps, is an important yet intricate task for
machines. It involves integrating common-sense knowledge to reason about
complex contextualized situations that are often counterfactual, e.g.
"scheduling a doctor's appointment without a phone". While current approaches
show encouraging results using large language models (LLMs), they are hindered
by drawbacks such as costly API calls and reproducibility issues. In this
paper, we advocate planning using smaller language models. We present PlaSma, a
novel two-pronged approach to endow small language models with procedural
knowledge and (counterfactual) planning capabilities. More concretely, we
develop symbolic procedural knowledge distillation to enhance the implicit
knowledge in small language models and an inference-time algorithm to
facilitate more structured and accurate reasoning. In addition, we introduce a
novel task, Counterfactual Planning, that requires a revision of a plan to cope
with a counterfactual situation. In both the original and counterfactual
settings, we show that orders-of-magnitude smaller models (770M-11B parameters)
can compete with and often surpass their larger teacher models' capabilities.
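
To make the two settings concrete, below is a minimal sketch of how a distilled student model could be queried for a plan and for a counterfactual revision. This is an illustration only: the checkpoint name, prompt templates, and helper functions are hypothetical rather than released PlaSma artifacts, and ordinary beam search stands in for the paper's structured inference-time decoding algorithm.

```python
# Minimal sketch: querying a distilled student model for procedural and
# counterfactual planning. MODEL_NAME and the prompt templates below are
# hypothetical placeholders, not released PlaSma artifacts, and plain
# beam search stands in for the paper's inference-time decoding algorithm.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "your-org/plasma-student-770m"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)

def _generate(prompt: str) -> str:
    """Run the student model on a prompt and decode the best beam."""
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, num_beams=4, max_new_tokens=256)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

def generate_plan(goal: str) -> str:
    """Original setting: decompose a goal into temporally ordered steps."""
    return _generate(f"Provide the steps to accomplish the goal: {goal}")

def revise_plan(goal: str, plan: str, condition: str) -> str:
    """Counterfactual setting: revise an existing plan so that it copes
    with a counterfactual condition."""
    return _generate(
        f"Goal: {goal}\nInitial plan: {plan}\n"
        f"Revise the plan to satisfy the condition: {condition}"
    )

# Original vs. counterfactual setting, using the abstract's running example.
goal = "schedule a doctor's appointment"
plan = generate_plan(goal)
print(plan)
print(revise_plan(goal, plan, condition="you do not have a phone"))
```

The sketch mirrors the abstract's framing: procedural planning maps a goal to ordered steps, while Counterfactual Planning is cast as revising an existing plan under an added condition.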