Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models

Abstract

Retrieval-Augmented Generation (RAG) has been shown to improve the factual accuracy of Large Language Models (LLMs), but existing methods often reason poorly over the retrieved evidence, particularly when built on open-source LLMs. To close this gap, we introduce Open-RAG, a novel framework designed to enhance reasoning capabilities in RAG with open-source LLMs. Our framework transforms an arbitrary dense LLM into a parameter-efficient sparse mixture-of-experts (MoE) model capable of handling complex reasoning tasks, including both single- and multi-hop queries. Open-RAG uniquely trains the model to navigate challenging distractors that appear relevant but are misleading. As a result, Open-RAG leverages latent learning, dynamically selecting relevant experts and integrating external knowledge effectively to produce more accurate and contextually relevant responses. In addition, we propose a hybrid adaptive retrieval method that determines whether retrieval is necessary and balances the trade-off between performance gain and inference speed. Experimental results show that the Llama2-7B-based Open-RAG outperforms state-of-the-art LLMs and RAG models such as ChatGPT, Self-RAG, and Command R+ on various knowledge-intensive tasks. We open-source our code and models at https://openragmoe.github.io/
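
The abstract does not spell out how retrieval necessity is decided, so the following is only a minimal sketch of one plausible form of adaptive retrieval: score the model's confidence in a draft answer produced without retrieval, and retrieve only when that score falls below a threshold. The function names (`mean_confidence`, `min_confidence`, `should_retrieve`), the two scoring rules, and the default threshold are all assumptions made for illustration, not the released Open-RAG implementation.

```python
import math
from typing import Callable, Sequence


def _check(token_probs: Sequence[float]) -> None:
    """Validate that we have at least one probability in (0, 1]."""
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")


def mean_confidence(token_probs: Sequence[float]) -> float:
    """Geometric mean of per-token probabilities (hypothetical score)."""
    _check(token_probs)
    log_sum = sum(math.log(p) for p in token_probs)
    return math.exp(log_sum / len(token_probs))


def min_confidence(token_probs: Sequence[float]) -> float:
    """Probability of the least certain token (hypothetical score)."""
    _check(token_probs)
    return min(token_probs)


def should_retrieve(
    token_probs: Sequence[float],
    threshold: float = 0.5,  # assumed value; a real system would tune this
    score: Callable[[Sequence[float]], float] = mean_confidence,
) -> bool:
    """Return True when the no-retrieval draft looks too uncertain."""
    return score(token_probs) < threshold


if __name__ == "__main__":
    confident_draft = [0.9, 0.9, 0.9]
    shaky_draft = [0.9, 0.1, 0.9]
    print(should_retrieve(confident_draft))  # confident draft: skip retrieval
    print(should_retrieve(shaky_draft))      # one very uncertain token: retrieve
```

Raising the threshold triggers retrieval more often, which would be expected to trade inference speed for accuracy; lowering it does the opposite. Swapping `score=min_confidence` makes the decision sensitive to a single uncertain token rather than to the average.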
