Grammar Prompting for Domain-Specific Language Generation with Large Language Models

Abstract

Large language models (LLMs) can learn to perform a wide range of natural language tasks from just a handful of in-context examples. However, when generating strings from highly structured languages (e.g., semantic parsing to complex domain-specific languages), it is challenging for the LLM to generalize from just a few exemplars. We explore grammar prompting as a simple approach for enabling LLMs to use external knowledge and domain-specific constraints, expressed through a grammar in Backus-Naur Form (BNF), during in-context learning. Grammar prompting augments each demonstration example with a specialized grammar that is minimally sufficient for generating the particular output example, where the specialized grammar is a subset of the full domain-specific language (DSL) grammar. At inference time, the LLM first predicts a BNF grammar given a test input, and then generates the output according to the rules of that grammar. Experiments demonstrate that grammar prompting can enable LLMs to perform competitively on a diverse set of DSL generation tasks, including semantic parsing (SMCalFlow, Overnight, GeoQuery), PDDL planning, and even molecule generation (SMILES).
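To make the prompt construction described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. The toy grammar, the rule-list representation of a derivation, the prompt layout ("input:", "grammar:", "output:"), and the helper names specialize, to_bnf, and build_prompt are all assumptions introduced for illustration. It shows how a specialized grammar can be taken as a subset of a full DSL grammar and attached to each demonstration, leaving the grammar for the test input to be predicted by the model.

```python
# Hypothetical toy DSL grammar: nonterminal -> list of alternatives (BNF right-hand sides).
FULL_GRAMMAR = {
    "query": ['"answer(" expr ")"'],
    "expr": ['"state(" expr ")"', '"city(" expr ")"', '"largest(" expr ")"', '"all"'],
}


def specialize(full_grammar, used_rules):
    """Keep only the alternatives that appear in one output's derivation.

    used_rules is a list of (lhs, rhs) pairs; how the derivation is obtained
    (for example, by parsing the output) is outside this sketch.
    """
    specialized = {}
    for lhs, rhs in used_rules:
        if rhs not in full_grammar.get(lhs, []):
            raise ValueError(f"rule not in full grammar: {lhs} ::= {rhs}")
        alternatives = specialized.setdefault(lhs, [])
        if rhs not in alternatives:
            alternatives.append(rhs)
    return specialized


def to_bnf(grammar):
    """Render a grammar dict as BNF text, one nonterminal per line."""
    return "\n".join(f"{lhs} ::= {' | '.join(alts)}" for lhs, alts in grammar.items())


def build_prompt(demos, test_input):
    """Format (input, specialized grammar, output) demos, then the test input.

    The prompt ends at "grammar:" so the model predicts a grammar first and
    the output second.
    """
    parts = [
        f"input: {x}\ngrammar:\n{to_bnf(g)}\noutput: {y}" for x, g, y in demos
    ]
    parts.append(f"input: {test_input}\ngrammar:")
    return "\n\n".join(parts)


# Rules used to derive one demonstration output (assumed to be known).
used = [
    ("query", '"answer(" expr ")"'),
    ("expr", '"largest(" expr ")"'),
    ("expr", '"state(" expr ")"'),
    ("expr", '"all"'),
]
spec = specialize(FULL_GRAMMAR, used)
demo = ("what is the largest state", spec, "answer(largest(state(all)))")
prompt = build_prompt([demo], "what is the largest city")
print(prompt)
```

In this sketch the specialized grammar omits the unused "city(" alternative, so the demonstration carries only the rules its output needs. Constraining the model's decoding to the predicted grammar is a separate step that is not shown here.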
