StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Abstract

Retrieval-augmented generation (RAG) is a key means of enhancing large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks, because the useful information these tasks require is badly scattered. This makes it difficult for existing RAG methods to accurately identify key information and to perform global reasoning over such noisy augmentation. In this paper, motivated by cognitive theories holding that humans convert raw information into various forms of structured knowledge when tackling knowledge-intensive reasoning, we propose a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct the original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks show that StructRAG achieves state-of-the-art performance and particularly excels in challenging scenarios, demonstrating its potential as an effective solution for enhancing LLMs in complex real-world applications.
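The three steps named in the abstract (choose a structure type, restructure the documents, answer from the structure) can be pictured with a minimal sketch. This is not the authors' implementation. The structure types (`table`, `graph`, `chunk`), the keyword-based selector, the comma-separated document format, and every function name below are assumptions made purely for illustration, and the `llm` argument is a placeholder for whatever model call a real system would use.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical structure types; the abstract does not enumerate them.
TABLE, GRAPH, CHUNK = "table", "graph", "chunk"


def choose_structure(question: str) -> str:
    """Step 1 (toy): keyword rules stand in for the real structure selector."""
    q = question.lower()
    if any(word in q for word in ("compare", "versus", "highest", "lowest")):
        return TABLE
    if any(word in q for word in ("related", "connected", "path", "cause")):
        return GRAPH
    return CHUNK


def build_table(documents: List[str]) -> List[Dict[str, str]]:
    """Step 2 (toy): assumes each document is a 'name, metric, value' line."""
    rows = []
    for doc in documents:
        name, metric, value = (part.strip() for part in doc.split(","))
        rows.append({"name": name, "metric": metric, "value": value})
    return rows


def build_graph(documents: List[str]) -> List[Tuple[str, str, str]]:
    """Step 2 (toy): assumes each document is a 'head, relation, tail' line."""
    edges = []
    for doc in documents:
        head, relation, tail = (part.strip() for part in doc.split(","))
        edges.append((head, relation, tail))
    return edges


def build_chunks(documents: List[str]) -> List[str]:
    """Step 2 fallback: keep the documents as plain text chunks."""
    return list(documents)


STRUCTURIZERS: Dict[str, Callable[[List[str]], object]] = {
    TABLE: build_table,
    GRAPH: build_graph,
    CHUNK: build_chunks,
}


def struct_rag_sketch(
    question: str, documents: List[str], llm: Callable[[str], str]
) -> Tuple[str, str]:
    """Run the three steps and return (chosen structure, model reply)."""
    structure = choose_structure(question)
    knowledge = STRUCTURIZERS[structure](documents)
    # Step 3: the model answers from the structured knowledge, not the raw text.
    prompt = (
        f"Structure type: {structure}\n"
        f"Structured knowledge: {knowledge}\n"
        f"Question: {question}"
    )
    return structure, llm(prompt)


if __name__ == "__main__":
    docs = ["Acme, revenue, 120", "Beta, revenue, 95"]
    # An echoing stand-in for a model call, so the sketch runs without any API.
    chosen, reply = struct_rag_sketch("Compare the revenue of Acme and Beta", docs, lambda p: p)
    print(chosen)
    print(reply)
```

Passing the model in as a plain callable keeps the sketch runnable offline; in practice each of the three steps would likely be handled by a model rather than by hand-written rules.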
