

MemoRAG: Moving towards Next-Gen RAG Via Memory-Inspired Knowledge Discovery

September 9, 2024
作者: Hongjin Qian, Peitian Zhang, Zheng Liu, Kelong Mao, Zhicheng Dou
cs.AI

Abstract

Retrieval-Augmented Generation (RAG) leverages retrieval tools to access external databases, thereby enhancing the generation quality of large language models (LLMs) through optimized context. However, existing retrieval methods are inherently constrained: they can only perform relevance matching between explicitly stated queries and well-formed knowledge, and are unable to handle tasks involving ambiguous information needs or unstructured knowledge. Consequently, existing RAG systems are primarily effective for straightforward question-answering tasks. In this work, we propose MemoRAG, a novel retrieval-augmented generation paradigm empowered by long-term memory. MemoRAG adopts a dual-system architecture. On the one hand, it employs a light but long-range LLM to form a global memory of the database. Once a task is presented, it generates draft answers, which clue the retrieval tools to locate useful information within the database. On the other hand, it leverages an expensive but expressive LLM, which generates the ultimate answer based on the retrieved information. Building on this general framework, we further optimize MemoRAG's performance by enhancing its cluing mechanism and memorization capacity. In our experiments, MemoRAG achieves superior performance across a variety of evaluation tasks, including both complex ones where conventional RAG fails and straightforward ones where RAG is commonly applied.
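
The abstract describes a dual-system flow: a light, long-range "memory" model compresses the database into a global memory and drafts clue answers, a retriever uses those clues to locate evidence, and an expressive model composes the final answer. The sketch below illustrates that flow only; every class, method, and the clue/retrieval logic are hypothetical stand-ins (not the authors' MemoRAG implementation or API), with the LLM calls replaced by trivial stubs.

```python
# Minimal illustrative sketch of a MemoRAG-style dual-system pipeline.
# All names and the stubbed "memory"/"clue" logic are assumptions for
# illustration; they do not reflect the authors' actual implementation.

from dataclasses import dataclass, field


@dataclass
class MemoRAGPipeline:
    corpus: list[str]                          # external database, pre-chunked
    memory: str = field(default="", init=False)  # compressed global memory

    def memorize(self) -> None:
        # A light but long-range LLM would compress the whole corpus into a
        # global memory; stubbed here as a truncated concatenation of chunks.
        self.memory = " ".join(chunk[:50] for chunk in self.corpus)

    def draft_clues(self, query: str) -> list[str]:
        # The memory model drafts answers that act as retrieval clues;
        # stubbed as extracting content words from the query.
        return [w for w in query.lower().split() if len(w) > 3]

    def retrieve(self, clues: list[str], top_k: int = 2) -> list[str]:
        # Clue-guided retrieval: rank chunks by how many clues they contain.
        ranked = sorted(
            self.corpus,
            key=lambda chunk: sum(c in chunk.lower() for c in clues),
            reverse=True,
        )
        return ranked[:top_k]

    def answer(self, query: str) -> str:
        # An expensive, expressive LLM would generate the final answer from
        # the retrieved evidence; stubbed as echoing the evidence.
        evidence = self.retrieve(self.draft_clues(query))
        return f"Answer to '{query}' grounded in: {evidence}"


if __name__ == "__main__":
    docs = [
        "MemoRAG builds a global memory over the database with a light LLM.",
        "Draft answers act as clues that guide the retriever to evidence.",
        "An expressive LLM composes the final answer from retrieved passages.",
    ]
    pipeline = MemoRAGPipeline(corpus=docs)
    pipeline.memorize()
    print(pipeline.answer("How does MemoRAG guide retrieval with clues?"))
```

In a real system, `memorize`, `draft_clues`, and `answer` would each wrap model calls (the first two to the lightweight memory model, the last to the expressive generator), and `retrieve` would use a standard dense or sparse retriever over the clues rather than keyword counting.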

