
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

February 13, 2025
Authors: Yung-Sung Chuang, Benjamin Cohen-Wang, Shannon Zejiang Shen, Zhaofeng Wu, Hu Xu, Xi Victoria Lin, James Glass, Shang-Wen Li, Wen-tau Yih
cs.AI

Abstract

We introduce SelfCite, a novel self-supervised approach that aligns LLMs to generate high-quality, fine-grained, sentence-level citations for the statements in their generated responses. Instead of relying solely on costly and labor-intensive annotations, SelfCite leverages a reward signal provided by the LLM itself through context ablation: if a citation is necessary, removing the cited text from the context should prevent the same response; if it is sufficient, retaining the cited text alone should preserve the same response. This reward can guide an inference-time best-of-N sampling strategy to significantly improve citation quality, and it can also be used in preference optimization to directly fine-tune models to generate better citations. The effectiveness of SelfCite is demonstrated by increasing citation F1 by up to 5.3 points on the LongBench-Cite benchmark across five long-form question answering tasks.
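The context-ablation reward described above can be made concrete with a short sketch. The Python below is a minimal illustration, not the paper's implementation: the model (`gpt2` as a small stand-in), the `response_log_prob` and `ablation_reward` helpers, and the choice to sum the necessity and sufficiency signals into a single scalar are all assumptions made here for clarity.

```python
# Minimal sketch of a context-ablation reward in the spirit of SelfCite.
# Assumptions: a Hugging Face causal LM scores responses; the paper's exact
# prompting, segmentation, and score normalization are not reproduced here.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper aligns much larger long-context LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def response_log_prob(context: str, response: str) -> float:
    """Sum of log-probabilities the LM assigns to `response` given `context`."""
    # Prepend BOS so an empty ablated context still yields a valid prefix.
    ctx_ids = tokenizer(tokenizer.bos_token + context, return_tensors="pt").input_ids
    resp_ids = tokenizer(response, return_tensors="pt").input_ids
    input_ids = torch.cat([ctx_ids, resp_ids], dim=1)
    with torch.no_grad():
        logits = model(input_ids).logits
    # Logits at position t predict token t+1, so score only the response span.
    resp_logits = logits[0, ctx_ids.size(1) - 1 : -1]
    log_probs = torch.log_softmax(resp_logits, dim=-1)
    token_lp = log_probs.gather(1, resp_ids[0].unsqueeze(1)).squeeze(1)
    return token_lp.sum().item()

def ablation_reward(full_context: str, cited_text: str, response: str) -> float:
    """Illustrative scalar reward combining the two ablation tests."""
    # Naive ablation: assumes `cited_text` appears verbatim in the context.
    context_without = full_context.replace(cited_text, "")
    lp_full = response_log_prob(full_context, response)
    # Necessity: removing the cited text should make the response less likely.
    necessity = lp_full - response_log_prob(context_without, response)
    # Sufficiency: the cited text alone should keep the response likely.
    sufficiency = response_log_prob(cited_text, response) - lp_full
    return necessity + sufficiency
```

Under these assumptions, best-of-N sampling would draw N candidate citation sets for a statement and keep the one with the highest reward, and the same scalar could rank citation pairs to build preference data for fine-tuning.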
