PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning

Abstract

Procedural planning, which entails decomposing a high-level goal into a sequence of temporally ordered steps, is an important yet intricate task for machines. It involves integrating common-sense knowledge to reason about complex contextualized situations that are often counterfactual, e.g., "scheduling a doctor's appointment without a phone". While current approaches show encouraging results using large language models (LLMs), they are hindered by drawbacks such as costly API calls and reproducibility issues. In this paper, we advocate planning using smaller language models. We present PlaSma, a novel two-pronged approach to endow small language models with procedural knowledge and (counterfactual) planning capabilities. More concretely, we develop symbolic procedural knowledge distillation to enhance the implicit knowledge in small language models and an inference-time algorithm to facilitate more structured and accurate reasoning. In addition, we introduce a novel task, Counterfactual Planning, that requires a revision of a plan to cope with a counterfactual situation. In both the original and counterfactual settings, we show that orders-of-magnitude smaller models (770M-11B parameters) can compete with and often surpass the capabilities of their larger teacher models.

