Augmenting LLM Reasoning with Dynamic Notes Writing for Complex QA

Abstract

Iterative RAG for multi-hop question answering struggles with lengthy contexts and the buildup of irrelevant information. This hinders a model's capacity to process and reason over retrieved content and limits performance. While recent methods focus on compressing retrieved information, they are either restricted to single-round RAG, require fine-tuning, or lack scalability in iterative RAG. To address these challenges, we propose Notes Writing, a method that generates concise and relevant notes from retrieved documents at each step, thereby reducing noise and retaining only essential information. This indirectly increases the effective context length of Large Language Models (LLMs), enabling them to reason and plan more effectively while processing larger volumes of input text. Notes Writing is framework-agnostic and can be integrated with different iterative RAG methods. We demonstrate its effectiveness with three iterative RAG methods, across two models and four evaluation datasets. Notes Writing yields an average improvement of 15.6 percentage points overall, with minimal increase in output tokens.
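
As a rough illustration of the idea in the abstract, the following Python sketch shows one way an iterative RAG loop could condense each step's retrieved documents into a short note before reasoning over the accumulated notes. It is a minimal sketch under stated assumptions, not the authors' implementation: the `retrieve` and `llm` callables, the prompt wording, the `ANSWER:`/`QUERY:` reply convention, and the `max_steps` budget are all hypothetical.

```python
from typing import Callable, List

# Hypothetical interfaces (assumptions, not from the paper): any retriever
# and any LLM call can be plugged in as plain functions.
Retriever = Callable[[str], List[str]]  # query -> retrieved documents
LLM = Callable[[str], str]              # prompt -> completion


def write_note(llm: LLM, question: str, query: str, docs: List[str]) -> str:
    """Condense the documents retrieved for one step into a short note."""
    joined = "\n\n".join(docs)
    prompt = (
        f"Main question: {question}\n"
        f"Current query: {query}\n"
        f"Documents:\n{joined}\n"
        "Write a concise note containing only information relevant to the query."
    )
    return llm(prompt).strip()


def iterative_rag_with_notes(
    question: str,
    retrieve: Retriever,
    llm: LLM,
    max_steps: int = 4,
) -> str:
    """Answer a question by retrieving, note-taking, and planning in a loop."""
    notes: List[str] = []
    query = question
    for _ in range(max_steps):
        docs = retrieve(query)
        # Only the note, not the raw documents, is carried to later steps.
        notes.append(write_note(llm, question, query, docs))
        context = "\n".join(f"- {n}" for n in notes)
        decision = llm(
            f"Question: {question}\nNotes so far:\n{context}\n"
            "Reply 'ANSWER: <answer>' if the notes suffice, "
            "otherwise 'QUERY: <next search query>'."
        ).strip()
        if decision.startswith("ANSWER:"):
            return decision[len("ANSWER:"):].strip()
        if decision.startswith("QUERY:"):
            query = decision[len("QUERY:"):].strip()
    # Step budget exhausted: ask for a final answer from the accumulated notes.
    context = "\n".join(f"- {n}" for n in notes)
    return llm(
        f"Question: {question}\nNotes:\n{context}\nGive the final answer."
    ).strip()
```

Because the raw documents are discarded after each step, the prompt grows with the number of notes and not with the total volume of retrieved text, which is the sense in which the effective context length increases.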
