Large Language Models as Planning Domain Generators

Abstract

Developing domain models is one of the few remaining tasks in AI planning that still require manual human labor. To make planning more accessible, it is therefore desirable to automate domain model generation. To this end, we investigate whether large language models (LLMs) can be used to generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automated evaluation of LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models, across 9 different planning domains and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL.
