Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation
June 26, 2024
Authors: Guanting Dong, Yutao Zhu, Chenghao Zhang, Zechen Wang, Zhicheng Dou, Ji-Rong Wen
cs.AI
Abstract
Retrieval-augmented generation (RAG) has demonstrated effectiveness in
mitigating the hallucination problem of large language models (LLMs). However,
the difficulty of aligning the retriever with the diverse LLMs' knowledge
preferences poses an inevitable challenge in developing a reliable
RAG system. To address this issue, we propose DPA-RAG, a universal framework
designed to align diverse knowledge preferences within RAG systems.
Specifically, we first introduce a preference knowledge construction
pipeline and incorporate five novel query augmentation strategies to alleviate
preference data scarcity. Based on preference data, DPA-RAG accomplishes both
external and internal preference alignment: 1) It jointly integrates pair-wise,
point-wise, and contrastive preference alignment abilities into the reranker,
achieving external preference alignment among RAG components. 2) It further
introduces a pre-alignment stage before vanilla Supervised Fine-tuning (SFT),
enabling LLMs to implicitly capture knowledge aligned with their reasoning
preferences, achieving LLMs' internal alignment. Experimental results across
four knowledge-intensive QA datasets demonstrate that DPA-RAG outperforms all
baselines and seamlessly integrates both black-box and open-source LLM
readers. Further qualitative analysis and discussions also provide empirical
guidance for achieving reliable RAG systems. Our code is publicly available at
https://github.com/dongguanting/DPA-RAG.