REARANK: Reasoning Re-ranking Agent via Reinforcement Learning
May 26, 2025
Authors: Le Zhang, Bo Wang, Xipeng Qiu, Siva Reddy, Aishwarya Agrawal
cs.AI
Abstract
We present REARANK, a large language model (LLM)-based listwise reasoning
reranking agent. REARANK explicitly reasons before reranking, significantly
improving both performance and interpretability. Leveraging reinforcement
learning and data augmentation, REARANK achieves substantial improvements over
baseline models across popular information retrieval benchmarks, notably
requiring only 179 annotated samples. Built on top of Qwen2.5-7B, our
REARANK-7B demonstrates performance comparable to GPT-4 on both in-domain and
out-of-domain benchmarks and even surpasses GPT-4 on reasoning-intensive BRIGHT
benchmarks. These results underscore the effectiveness of our approach and
highlight how reinforcement learning can enhance LLM reasoning capabilities in
reranking.
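To make the listwise setting concrete, here is a minimal sketch of how a ranking emitted by a listwise reranker could be applied to a candidate list. The output format (`[3] > [1] > [2]`, 1-based identifiers) and the function name `apply_listwise_ranking` are illustrative assumptions; the abstract does not specify REARANK's output format, and this is not the authors' implementation.

```python
import re
from typing import List


def apply_listwise_ranking(passages: List[str], ranking: str) -> List[str]:
    """Reorder passages according to a ranking string like "[3] > [1] > [2]".

    Assumed format (hypothetical): 1-based passage ids in square brackets,
    listed from most to least relevant.
    """
    order: List[int] = []
    for match in re.findall(r"\[(\d+)\]", ranking):
        idx = int(match) - 1
        # Skip ids that are out of range or repeated in the model output.
        if 0 <= idx < len(passages) and idx not in order:
            order.append(idx)
    # Passages the model did not mention keep their original relative order
    # and are placed after the ranked ones.
    order += [i for i in range(len(passages)) if i not in order]
    return [passages[i] for i in order]


if __name__ == "__main__":
    candidates = ["passage A", "passage B", "passage C"]
    print(apply_listwise_ranking(candidates, "[3] > [1]"))
```

Falling back to the original order for unmentioned or invalid identifiers keeps the result a valid permutation of the candidates even when the model output is incomplete.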