Large Language Models as Planning Domain Generators
April 2, 2024
Authors: James Oswald, Kavitha Srinivas, Harsha Kokel, Junkyu Lee, Michael Katz, Shirin Sohrabi
cs.AI
Abstract
Developing domain models is one of the few remaining places that require
manual human labor in AI planning. Thus, in order to make planning more
accessible, it is desirable to automate the process of domain model generation.
To this end, we investigate whether large language models (LLMs) can be used to
generate planning domain models from simple textual descriptions. Specifically,
we introduce a framework for automated evaluation of LLM-generated domains by
comparing the sets of plans for domain instances. Finally, we perform an
empirical analysis of 7 large language models, including coding and chat models,
across 9 different planning domains and under three classes of natural
language domain descriptions. Our results indicate that LLMs, particularly
those with high parameter counts, exhibit a moderate level of proficiency in
generating correct planning domains from natural language descriptions. Our
code is available at https://github.com/IBM/NL2PDDL.
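
The idea of evaluating a generated domain by comparing plan sets can be illustrated with a toy sketch. This is not the NL2PDDL implementation: the `Action` class, the `plans` and `same_plan_sets` helpers, the two-action "shelf" domain, and the bounded breadth-first enumeration are all assumptions made for illustration. A real pipeline would presumably work on PDDL files with an off-the-shelf planner and plan validator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A ground STRIPS-style action (hypothetical toy representation)."""
    name: str
    pre: frozenset
    add: frozenset
    delete: frozenset


def plans(actions, init, goal, max_len):
    """Return every action-name sequence of length <= max_len that reaches the goal."""
    goal = frozenset(goal)
    found = set()
    frontier = [(frozenset(init), ())]
    for _ in range(max_len + 1):
        next_frontier = []
        for state, path in frontier:
            if goal <= state:
                found.add(path)
                continue  # do not extend a plan past the goal
            for a in actions:
                if a.pre <= state:
                    successor = (state - a.delete) | a.add
                    next_frontier.append((successor, path + (a.name,)))
        frontier = next_frontier
    return found


def same_plan_sets(domain_a, domain_b, init, goal, max_len=3):
    """Treat two domains as equivalent on one instance if their bounded plan sets match."""
    return plans(domain_a, init, goal, max_len) == plans(domain_b, init, goal, max_len)


fs = frozenset
pick = Action("pick", fs({"hand_empty", "on_table"}), fs({"holding"}),
              fs({"hand_empty", "on_table"}))
place = Action("place", fs({"holding"}), fs({"on_shelf", "hand_empty"}), fs({"holding"}))
# A "generated" variant that dropped the precondition of place.
bad_place = Action("place", fs(), fs({"on_shelf", "hand_empty"}), fs({"holding"}))

reference = [pick, place]
generated = [pick, bad_place]
init, goal = {"hand_empty", "on_table"}, {"on_shelf"}

print(same_plan_sets(reference, reference, init, goal))  # True
print(same_plan_sets(reference, generated, init, goal))  # False
```

In this toy instance the flawed domain admits the extra one-step plan `("place",)`, so its plan set differs from the reference and the comparison flags it. Exhaustive enumeration only scales to tiny instances; it stands in here for whatever planner-based comparison an actual evaluation would use.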