

Similarity is Not All You Need: Endowing Retrieval Augmented Generation with Multi Layered Thoughts

May 30, 2024
作者: Chunjing Gan, Dan Yang, Binbin Hu, Hanxiao Zhang, Siyuan Li, Ziqi Liu, Yue Shen, Lin Ju, Zhiqiang Zhang, Jinjie Gu, Lei Liang, Jun Zhou
cs.AI

Abstract

In recent years, large language models (LLMs) have made remarkable achievements in various domains. However, the untimeliness and cost of knowledge updates, coupled with the hallucination issues of LLMs, have curtailed their applications in knowledge-intensive tasks, where retrieval augmented generation (RAG) can be of help. Nevertheless, existing retrieval augmented models typically use similarity as a bridge between queries and documents and follow a retrieve-then-read procedure. In this work, we argue that similarity is not always the panacea, and relying solely on similarity can sometimes degrade the performance of retrieval augmented generation. To this end, we propose MetRag, a Multi-layEred Thoughts enhanced Retrieval Augmented Generation framework. To begin with, beyond the existing similarity-oriented thought, we embrace a small-scale utility model that draws supervision from an LLM for utility-oriented thought, and further arrive at a smarter model by comprehensively combining the similarity- and utility-oriented thoughts. Furthermore, given that the retrieved document set tends to be large and using the documents in isolation makes it difficult to capture the commonalities and characteristics among them, we propose to use an LLM as a task-adaptive summarizer to endow retrieval augmented generation with compactness-oriented thought. Finally, with the multi-layered thoughts from the preceding stages, an LLM is called upon for knowledge-augmented generation. Extensive experiments on knowledge-intensive tasks have demonstrated the superiority of MetRag.
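The staged pipeline the abstract describes (similarity-oriented retrieval, utility re-scoring, score combination, and a compactness-oriented summarizer feeding the final generator) can be sketched with toy stand-ins. All function bodies, the mixing weight `alpha`, and the scoring heuristics below are illustrative assumptions, not the paper's implementation, which uses a dense retriever, a small utility model supervised by an LLM, and an LLM summarizer/generator:

```python
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z]+", text.lower()))

def similarity_score(query: str, doc: str) -> float:
    # Stand-in for similarity-oriented thought: Jaccard token overlap
    # instead of a learned dense retriever.
    q, d = tokens(query), tokens(doc)
    return len(q & d) / max(len(q | d), 1)

def utility_score(query: str, doc: str) -> float:
    # Stand-in for the small-scale utility model distilled from an LLM:
    # here we only check whether the longest query term appears.
    key = max(tokens(query), key=len)
    return 1.0 if key in tokens(doc) else 0.0

def metrag_retrieve(query: str, corpus: list[str],
                    alpha: float = 0.5, k: int = 2) -> list[str]:
    # Comprehensively combine similarity- and utility-oriented thoughts
    # into a single ranking score, then keep the top-k documents.
    scored = sorted(
        corpus,
        key=lambda d: alpha * similarity_score(query, d)
        + (1 - alpha) * utility_score(query, d),
        reverse=True,
    )
    return scored[:k]

def summarize(query: str, docs: list[str]) -> str:
    # Stand-in for the task-adaptive LLM summarizer
    # (compactness-oriented thought): concatenate the kept evidence.
    return " ".join(docs)

corpus = [
    "Paris is the capital of France.",
    "France is in Western Europe.",
    "The capital of Germany is Berlin.",
]
docs = metrag_retrieve("capital of France", corpus)
context = summarize("capital of France", docs)  # fed to the generator LLM
print(context)
```

In this sketch the combined score surfaces the most directly useful document first, even when several documents share surface overlap with the query; in MetRag that combination is learned rather than hand-weighted.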
