PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents

Abstract

Strategies such as chain-of-thought prompting improve the performance of large language models (LLMs) on complex reasoning tasks by decomposing input examples into intermediate steps. However, it remains unclear how to apply such methods to reason over long input documents, in which both the decomposition and the output of each intermediate step are non-trivial to obtain. In this work, we propose PEARL, a prompting framework to improve reasoning over long documents, which consists of three stages: action mining, plan formulation, and plan execution. More specifically, given a question about a long document, PEARL decomposes the question into a sequence of actions (e.g., SUMMARIZE, FIND_EVENT, FIND_RELATION) and then executes them over the document to obtain the answer. Each stage of PEARL is implemented via zero-shot or few-shot prompting of LLMs (in our work, GPT-4) with minimal human input. We evaluate PEARL on a challenging subset of the QuALITY dataset, which contains questions that require complex reasoning over long narrative texts. PEARL outperforms zero-shot and chain-of-thought prompting on this dataset, and ablation experiments show that each stage of PEARL is critical to its performance. Overall, PEARL is a first step towards leveraging LLMs to reason over long documents.
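The plan execution stage can be pictured as running a short sequence of named actions over the document, each step storing its output for later steps to use. The following is a minimal sketch under stated assumptions, not the authors' implementation: the `Step` structure, the `execute_plan` function, and the variable names `CTX` and `KEYWORD` are hypothetical, and the two action functions are toy string operations standing in for the LLM prompts that PEARL uses to execute each action. Action mining and plan formulation are not shown; the plan here is written by hand.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    output: str      # variable name that stores this step's result
    action: str      # action name, e.g. "FIND_EVENT"
    args: List[str]  # names of earlier outputs, or "CTX" for the document


def find_event(ctx: str, keyword: str) -> str:
    # Toy stand-in for an LLM call: keep sentences that mention the keyword.
    sentences = [s.strip() for s in ctx.split(".") if s.strip()]
    return ". ".join(s for s in sentences if keyword.lower() in s.lower())


def summarize(text: str) -> str:
    # Toy stand-in for an LLM call: keep only the first sentence.
    return text.split(".")[0].strip()


# Hypothetical action registry; in PEARL the action set is mined by the LLM.
ACTIONS: Dict[str, Callable[..., str]] = {
    "FIND_EVENT": find_event,
    "SUMMARIZE": summarize,
}


def execute_plan(plan: List[Step], document: str, literals: Dict[str, str]) -> str:
    # The environment maps variable names to values; CTX is the full document.
    env = {"CTX": document, **literals}
    result = ""
    for step in plan:
        fn = ACTIONS[step.action]
        result = fn(*(env[name] for name in step.args))
        env[step.output] = result  # make the output available to later steps
    return result  # the last step's output is treated as the answer


doc = "Ann left the ship at dawn. The crew searched the hold. Ann returned with a map."
plan = [
    Step("events", "FIND_EVENT", ["CTX", "KEYWORD"]),
    Step("answer", "SUMMARIZE", ["events"]),
]
print(execute_plan(plan, doc, {"KEYWORD": "Ann"}))
```

Keeping intermediate outputs in a named environment is one simple way to let later actions refer to earlier results, which is the property that separates a multi-step plan from a single prompt over the whole document.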
