Spectrum Projection Score: Aligning Retrieved Summaries with Reader Models in Retrieval-Augmented Generation
August 8, 2025
Authors: Zhanghao Hu, Qinglin Zhu, Siya Qi, Yulan He, Hanqi Yan, Lin Gui
cs.AI
Abstract
Large Language Models (LLMs) have shown improved generation performance
through retrieval-augmented generation (RAG) following the retriever-reader
paradigm, which supplements model inputs with externally retrieved knowledge.
However, prior work often evaluates RAG holistically, assessing the retriever
and reader jointly, making it difficult to isolate the true contribution of
retrieval, particularly given the prompt sensitivity of LLMs used as readers.
We introduce the Spectrum Projection Score (SPS), a lightweight,
supervision-free metric that lets the reader gauge how well a retrieved
summary aligns semantically with its hidden representation. SPS compares the
area formed by the tokens generated from the summary with the principal
directions of the reader's subspace and uses the result as a measure of
relevance. Building on SPS, we present xCompress, an inference-time controller
framework that dynamically samples, ranks, and compresses retrieval summary
candidates. Extensive experiments on five QA benchmarks with four open-source
LLMs show that SPS not only enhances
performance across a range of tasks but also provides a principled perspective
on the interaction between retrieval and generation.