DynamicRAG: Leveraging Outputs of Large Language Model as Feedback for Dynamic Reranking in Retrieval-Augmented Generation
May 12, 2025
作者: Jiashuo Sun, Xianrui Zhong, Sizhe Zhou, Jiawei Han
cs.AI
Abstract
Retrieval-augmented generation (RAG) systems combine large language models
(LLMs) with external knowledge retrieval, making them highly effective for
knowledge-intensive tasks. A crucial but often under-explored component of
these systems is the reranker, which refines retrieved documents to enhance
generation quality and explainability. The challenge of selecting the optimal
number of documents (k) remains unsolved: too few may omit critical
information, while too many introduce noise and inefficiencies. Although recent
studies have explored LLM-based rerankers, they primarily leverage internal
model knowledge and overlook the rich supervisory signals that LLMs can
provide, such as using response quality as feedback for optimizing reranking
decisions. In this paper, we propose DynamicRAG, a novel RAG framework where
the reranker dynamically adjusts both the order and number of retrieved
documents based on the query. We model the reranker as an agent optimized
through reinforcement learning (RL), using rewards derived from LLM output
quality. Across seven knowledge-intensive datasets, DynamicRAG demonstrates
superior performance, achieving state-of-the-art results. The model, data and
code are available at https://github.com/GasolSun36/DynamicRAG.
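The abstract frames the reranker as an agent trained with reinforcement learning, where the reward comes from the quality of the LLM's generated output and the number of kept documents (k) varies per query. The toy sketch below illustrates that training signal with a per-document keep/drop policy and a REINFORCE-style update; all names, the feature representation, and the stand-in F1 reward are illustrative assumptions, not the paper's actual implementation:

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

class DynamicReranker:
    """Toy reranker agent: scores each retrieved document and keeps a
    query-dependent subset, so k varies instead of being fixed."""

    def __init__(self, dim, lr=0.1, seed=0):
        self.rng = random.Random(seed)
        self.w = [self.rng.uniform(-0.1, 0.1) for _ in range(dim)]
        self.lr = lr

    def score(self, feats):
        # Linear relevance score for a document's feature vector.
        return sum(wi * fi for wi, fi in zip(self.w, feats))

    def select(self, docs):
        """Sample a keep/drop action per document (docs: [(id, feats)]).
        Returns (kept_ids, keep_probs, actions)."""
        probs = [sigmoid(self.score(f)) for _, f in docs]
        actions = [self.rng.random() < p for p in probs]
        kept = [d for (d, _), a in zip(docs, actions) if a]
        return kept, probs, actions

    def update(self, docs, probs, actions, advantage):
        """REINFORCE-style update: d log pi / d score is (1 - p) for a
        kept document and -p for a dropped one."""
        for (_, feats), p, a in zip(docs, probs, actions):
            grad_logp = (1.0 - p) if a else -p
            for i, fi in enumerate(feats):
                self.w[i] += self.lr * advantage * grad_logp * fi

def generation_reward(kept_ids, relevant_ids):
    """Stand-in for LLM output quality: F1 between the kept set and the
    documents actually needed for a good answer."""
    if not kept_ids:
        return 0.0
    hits = len(set(kept_ids) & relevant_ids)
    if hits == 0:
        return 0.0
    prec = hits / len(kept_ids)
    rec = hits / len(relevant_ids)
    return 2 * prec * rec / (prec + rec)
```

In this sketch the reward would in practice be derived from the downstream LLM's answer quality (e.g. exact match or a judge score); over episodes, documents whose inclusion correlates with higher-than-baseline rewards see their keep probability rise, and the expected k adapts to the query.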