DeCoRe: Decoding by Contrasting Retrieval Heads to Mitigate Hallucinations
October 24, 2024
Authors: Aryo Pradipta Gema, Chen Jin, Ahmed Abdulaal, Tom Diethe, Philip Teare, Beatrice Alex, Pasquale Minervini, Amrutha Saseendran
cs.AI
Abstract
Large Language Models (LLMs) often hallucinate, producing unfaithful or factually incorrect outputs by misrepresenting the provided context or incorrectly recalling internal knowledge. Recent studies have identified specific attention heads within the Transformer architecture, known as retrieval heads, responsible for extracting relevant contextual information. We hypothesise that masking these retrieval heads can induce hallucinations and that contrasting the outputs of the base LLM and the masked LLM can reduce hallucinations. To this end, we propose Decoding by Contrasting Retrieval Heads (DeCoRe), a novel training-free decoding strategy that amplifies information found in the context and model parameters. DeCoRe mitigates potentially hallucinated responses by dynamically contrasting the outputs of the base LLM and the masked LLM, using conditional entropy as a guide. Our extensive experiments confirm that DeCoRe significantly improves performance on tasks requiring high contextual faithfulness, such as summarisation (XSum by 18.6%), instruction following (MemoTrap by 10.9%), and open-book question answering (NQ-Open by 2.4% and NQ-Swap by 5.5%).
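To make the contrastive step concrete, the following is a minimal sketch of what entropy-guided contrastive decoding of this kind could look like at a single decoding step. The function name, the way entropy is normalised and used to scale the contrast weight, and the `(1 + alpha) * base - alpha * masked` combination are illustrative assumptions, not the paper's exact formulation; it only presumes access to next-token logits from the base model and from a copy with its retrieval heads masked.

```python
import torch
import torch.nn.functional as F

def decore_next_token_scores(base_logits, masked_logits, alpha_max=1.0):
    """Sketch of entropy-guided contrastive decoding in the spirit of DeCoRe.

    base_logits:   next-token logits from the unmodified LLM, shape (vocab,)
    masked_logits: next-token logits from the same LLM with its retrieval
                   heads masked (the hallucination-prone copy), shape (vocab,)
    alpha_max:     assumed hyperparameter bounding the contrast strength
    """
    base_log_probs = F.log_softmax(base_logits, dim=-1)
    masked_log_probs = F.log_softmax(masked_logits, dim=-1)

    # Conditional entropy of the base model's next-token distribution,
    # normalised to [0, 1] by the maximum entropy log(|vocab|). One plausible
    # reading of "conditional entropy as a guide": when the base model is
    # uncertain, lean more heavily on the contrast with the masked model.
    probs = base_log_probs.exp()
    entropy = -(probs * base_log_probs).sum(dim=-1)
    max_entropy = torch.log(torch.tensor(float(base_logits.shape[-1])))
    alpha = alpha_max * (entropy / max_entropy)

    # Contrastive combination: amplify what the base model predicts relative
    # to the retrieval-head-masked model, penalising tokens the masked
    # (hallucination-prone) model favours.
    return (1 + alpha) * base_log_probs - alpha * masked_log_probs
```

A generation loop would call this at every step with freshly computed logits from both model copies, then take the argmax of (or sample from a softmax over) the returned scores to pick the next token.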