Large Language Models as Planning Domain Generators

Abstract

Developing domain models is one of the few remaining tasks in AI planning that still require manual human labor. To make planning more accessible, it is therefore desirable to automate domain model generation. To this end, we investigate whether large language models (LLMs) can generate planning domain models from simple textual descriptions. Specifically, we introduce a framework for automatically evaluating LLM-generated domains by comparing the sets of plans for domain instances. Finally, we perform an empirical analysis of 7 large language models, including coding and chat models, across 9 different planning domains and under three classes of natural language domain descriptions. Our results indicate that LLMs, particularly those with high parameter counts, exhibit a moderate level of proficiency in generating correct planning domains from natural language descriptions. Our code is available at https://github.com/IBM/NL2PDDL.
