Understand What LLM Needs: Dual Preference Alignment for Retrieval-Augmented Generation

Abstract

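Retrieval-augmented generation (RAG) has proven effective at mitigating hallucination in large language models (LLMs). However, the difficulty of aligning the retriever with the knowledge preferences of diverse LLMs poses an unavoidable challenge in developing reliable RAG systems. To address this problem, we propose DPA-RAG, a general-purpose framework for aligning diverse knowledge preferences within RAG systems. Specifically, we first introduce a preference knowledge construction pipeline and incorporate five novel query augmentation strategies to alleviate the scarcity of preference data. Based on the preference data, DPA-RAG achieves both external and internal preference alignment: 1) it integrates pair-wise, point-wise, and contrastive preference alignment abilities into the reranker, achieving external preference alignment among RAG components; 2) it introduces a pre-aligned stage before vanilla supervised fine-tuning (SFT), enabling LLMs to implicitly capture knowledge aligned with their reasoning preferences and thereby achieving internal alignment of the LLM. Experimental results on four knowledge-intensive QA datasets show that DPA-RAG outperforms all baselines and seamlessly integrates both black-box and open-source LLM readers. Further qualitative analysis and discussion provide empirical guidance for building reliable RAG systems. Our code is publicly available at https://github.com/dongguanting/DPA-RAG.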

English

Retrieval-augmented generation (RAG) has demonstrated effectiveness in mitigating the hallucination problem of large language models (LLMs). However, the difficulty of aligning the retriever with the knowledge preferences of diverse LLMs poses an inevitable challenge in developing a reliable RAG system. To address this issue, we propose DPA-RAG, a universal framework designed to align diverse knowledge preferences within RAG systems. Specifically, we first introduce a preference knowledge construction pipeline and incorporate five novel query augmentation strategies to alleviate preference data scarcity. Based on the preference data, DPA-RAG accomplishes both external and internal preference alignment: 1) It jointly integrates pair-wise, point-wise, and contrastive preference alignment abilities into the reranker, achieving external preference alignment among RAG components. 2) It further introduces a pre-aligned stage before vanilla Supervised Fine-tuning (SFT), enabling LLMs to implicitly capture knowledge aligned with their reasoning preferences, achieving LLMs' internal alignment. Experimental results across four knowledge-intensive QA datasets demonstrate that DPA-RAG outperforms all baselines and seamlessly integrates both black-box and open-source LLM readers. Further qualitative analysis and discussion also provide empirical guidance for achieving reliable RAG systems. Our code is publicly available at https://github.com/dongguanting/DPA-RAG.
