DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
October 24, 2024
Authors: Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran
cs.AI
Abstract
Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations and that contrasting the outputs of the base LLM and the masked LLM can reduce hallucinations. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%).
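To make the contrastive step concrete, the following is a minimal sketch of what entropy-guided contrastive decoding of this kind could look like at a single decoding step. The function name, the way entropy is normalised and used to scale the contrast weight, and the `(1 + alpha) * base - alpha * masked` combination are illustrative assumptions, not the paper's exact formulation; it only presumes access to next-token logits from the base model and from a copy with its retrieval heads masked.

```python
import torch
import torch.nn.functional as F

def decore_next_token_scores(base_logits, masked_logits, alpha_max=1.0):
    """Sketch of entropy-guided contrastive decoding in the spirit of DeCoRe.

    base_logits:   next-token logits from the unmodified LLM, shape (vocab,)
    masked_logits: next-token logits from the same LLM with its retrieval
                   heads masked (the hallucination-prone copy), shape (vocab,)
    alpha_max:     assumed hyperparameter bounding the contrast strength
    """
    base_log_probs = F.log_softmax(base_logits, dim=-1)
    masked_log_probs = F.log_softmax(masked_logits, dim=-1)

    # Conditional entropy of the base model's next-token distribution,
    # normalised to [0, 1] by the maximum entropy log(|vocab|). One plausible
    # reading of "conditional entropy as a guide": when the base model is
    # uncertain, lean more heavily on the contrast with the masked model.
    probs = base_log_probs.exp()
    entropy = -(probs * base_log_probs).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(base_logits.shape[-1])))
    alpha = alpha_max * (entropy / max_entropy)

    # Contrastive combination: amplify what the base model predicts relative
    # to the retrieval-head-masked model, penalising tokens the masked
    # (hallucination-prone) model favours.
    return (1 + alpha) * base_log_probs - alpha * masked_log_probs
```

A generation loop would call this at every step with freshly computed logits from both model copies, then take the argmax of (or sample from a softmax over) the returned scores to pick the next token.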