StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Abstract

Retrieval-augmented generation (RAG) is a key means of effectively enhancing large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks, because the useful information these tasks require is badly scattered. This makes it difficult for existing RAG methods to accurately identify key information and to perform global reasoning over such noisy augmentation. In this paper, motivated by cognitive theories that humans convert raw information into various kinds of structured knowledge when tackling knowledge-intensive reasoning, we propose a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct the original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks show that StructRAG achieves state-of-the-art performance and excels particularly in challenging scenarios, demonstrating its potential as an effective solution for enhancing LLMs in complex real-world applications.
