Grammar Prompting for Domain-Specific Language Generation with Large Language Models
May 30, 2023
Authors: Bailin Wang, Zi Wang, Xuezhi Wang, Yuan Cao, Rif A. Saurous, Yoon Kim
cs.AI
Abstract
Large language models (LLMs) can learn to perform a wide range of natural
language tasks from just a handful of in-context examples. However, for
generating strings from highly structured languages (e.g., semantic parsing to
complex domain-specific languages), it is challenging for the LLM to generalize
from just a few exemplars. We explore grammar prompting as a simple
approach for enabling LLMs to use external knowledge and domain-specific
constraints, expressed through a grammar written in Backus–Naur Form (BNF),
during in-context learning. Grammar prompting augments each demonstration
example with a specialized grammar that is minimally sufficient for generating
the particular output example, where the specialized grammar is a subset of the
full DSL grammar. For inference, the LLM first predicts a BNF grammar given a
test input, and then generates the output according to the rules of the
grammar. Experiments demonstrate that grammar prompting can enable LLMs to
perform competitively on a diverse set of DSL generation tasks, including
semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and even
molecule generation (SMILES).
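
To make the two-stage procedure concrete, here is a minimal sketch in Python. It illustrates the idea rather than the paper's implementation: `llm` is a hypothetical stand-in for any text-completion call, and the prompt layout, the `Demo` structure, and the toy GeoQuery-flavored grammar are all assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Demo:
    """One in-context demonstration for grammar prompting."""
    question: str  # natural-language input
    grammar: str   # specialized BNF grammar: a minimally sufficient subset of the full DSL grammar
    program: str   # target DSL output, derivable from `grammar`


def llm(prompt: str) -> str:
    """Hypothetical stand-in for any text-completion LLM call."""
    raise NotImplementedError


def build_prompt(demos: list[Demo], test_question: str) -> str:
    """Format demonstrations as (question, specialized grammar, program)
    triples, ending with the test question so the LLM's next step is
    grammar prediction."""
    parts = [
        f"Question: {d.question}\nBNF grammar:\n{d.grammar}\nProgram: {d.program}\n"
        for d in demos
    ]
    parts.append(f"Question: {test_question}\nBNF grammar:\n")
    return "\n".join(parts)


def grammar_prompt(demos: list[Demo], test_question: str) -> tuple[str, str]:
    prompt = build_prompt(demos, test_question)
    # Stage 1: the LLM predicts a specialized BNF grammar for the test input.
    predicted_grammar = llm(prompt).strip()
    # Stage 2: the LLM generates the program conditioned on that grammar,
    # following the predicted grammar's rules.
    program = llm(prompt + predicted_grammar + "\nProgram: ").strip()
    return predicted_grammar, program


if __name__ == "__main__":
    # Toy GeoQuery-flavored demonstration; the grammar below is a hypothetical
    # subset of a full DSL grammar, minimally sufficient for this one program.
    demo = Demo(
        question="What is the capital of Texas?",
        grammar='query ::= "answer(" capital ")"\n'
                'capital ::= "capital(" state ")"\n'
                'state ::= "stateid(" NAME ")"',
        program="answer(capital(stateid('texas')))",
    )
    print(build_prompt([demo], "What is the capital of Ohio?"))
```

The abstract's phrasing ("generates the output according to the rules of the grammar") suggests the second call can additionally be constrained during decoding so that the final program is guaranteed to be derivable from the predicted grammar.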