StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization

Abstract

Retrieval-augmented generation (RAG) is a key means of enhancing large language models (LLMs) in many knowledge-based tasks. However, existing RAG methods struggle with knowledge-intensive reasoning tasks, because the useful information these tasks require is widely scattered. This makes it difficult for existing RAG methods to accurately identify key information and to perform global reasoning over such noisy augmentation. In this paper, motivated by cognitive theories holding that humans convert raw information into various forms of structured knowledge when tackling knowledge-intensive reasoning, we propose a new framework, StructRAG, which can identify the optimal structure type for the task at hand, reconstruct the original documents into this structured format, and infer answers based on the resulting structure. Extensive experiments across various knowledge-intensive tasks show that StructRAG achieves state-of-the-art performance and particularly excels in challenging scenarios, demonstrating its potential as an effective solution for enhancing LLMs in complex real-world applications.
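The three stages named in the abstract (choose a structure type, restructure the documents, answer from the structure) can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not the authors' implementation: the structure types listed, the prompt wording, the `LLM` callable interface, and every function name below are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical structure types for illustration only;
# the abstract does not enumerate the types the framework uses.
STRUCTURE_TYPES = ("table", "graph", "chunk")

# Assumed interface: any text-in, text-out model call.
LLM = Callable[[str], str]


@dataclass
class StructuredKnowledge:
    structure_type: str
    content: str


def choose_structure(question: str, llm: LLM) -> str:
    """Stage 1: ask the model which structure type suits the task."""
    prompt = (
        f"Pick one of {', '.join(STRUCTURE_TYPES)} for answering:\n{question}\n"
        "Reply with the type name only."
    )
    reply = llm(prompt).strip().lower()
    # Fall back to plain chunks if the reply is not a known type.
    return reply if reply in STRUCTURE_TYPES else "chunk"


def structurize(documents: List[str], structure_type: str, llm: LLM) -> StructuredKnowledge:
    """Stage 2: rewrite the raw documents into the chosen structure."""
    joined = "\n\n".join(documents)
    prompt = f"Rewrite the following documents as a {structure_type}:\n{joined}"
    return StructuredKnowledge(structure_type, llm(prompt))


def answer(question: str, knowledge: StructuredKnowledge, llm: LLM) -> str:
    """Stage 3: infer the answer from the structured knowledge."""
    prompt = (
        f"Using this {knowledge.structure_type}:\n{knowledge.content}\n"
        f"Answer the question: {question}"
    )
    return llm(prompt)


def struct_rag_sketch(question: str, documents: List[str], llm: LLM) -> str:
    """Chain the three stages; all names here are illustrative."""
    structure_type = choose_structure(question, llm)
    knowledge = structurize(documents, structure_type, llm)
    return answer(question, knowledge, llm)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM client, so a stub function can stand in for the model when trying it out.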
