Can Knowledge Editing Really Correct Hallucinations?
October 21, 2024
Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu
cs.AI
Abstract
Large Language Models (LLMs) suffer from hallucinations, i.e., non-factual
information in generated content, despite their strong capabilities across
tasks. Meanwhile, knowledge editing has emerged as a popular new paradigm
for correcting the erroneous factual knowledge encoded in LLMs, with the
advantage of avoiding retraining from scratch. However, one common issue of
existing evaluation datasets for knowledge editing is that they do not ensure
LLMs actually generate hallucinated answers to the evaluation questions before
editing. When LLMs are evaluated on such datasets after being edited by
different techniques, it is hard to directly adopt the performance to assess
the effectiveness of different knowledge editing methods in correcting
hallucinations. Thus, the fundamental question remains insufficiently
validated: Can knowledge editing really correct hallucinations in LLMs? We
propose HalluEditBench to holistically benchmark knowledge editing methods in
correcting real-world hallucinations. First, we rigorously construct a massive
hallucination dataset with 9 domains, 26 topics, and more than 6,000
hallucinations. Then, we assess the performance of knowledge editing methods in
a holistic way on five dimensions including Efficacy, Generalization,
Portability, Locality, and Robustness. Through HalluEditBench, we provide
new insights into the potential and limitations of different knowledge editing
methods in correcting hallucinations, which could inspire future improvements
and facilitate progress in the field of knowledge editing.
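The key dataset-construction principle above — only including questions the model actually answers incorrectly before editing, so every benchmark item is a verified hallucination for that model — can be sketched as follows. This is a minimal illustration with hypothetical helper names and a toy substring-match check, not the paper's actual pipeline:

```python
def filter_hallucinations(model_answer, qa_pairs):
    """Keep only QA pairs where the model's pre-edit answer is wrong,
    so every retained item is a verified hallucination for this model.

    model_answer: callable mapping a question string to the model's answer.
    qa_pairs: list of (question, gold_answer) tuples.
    """
    hallucinated = []
    for question, gold_answer in qa_pairs:
        answer = model_answer(question)  # pre-edit model output
        # Toy correctness check: does the gold answer appear in the output?
        if gold_answer.lower() not in answer.lower():
            hallucinated.append((question, gold_answer))
    return hallucinated

# Toy stand-in for a real LLM call (assumption for illustration only).
fake_model = lambda q: {"Capital of France?": "Lyon",
                        "Capital of Japan?": "Tokyo"}.get(q, "")

pairs = [("Capital of France?", "Paris"), ("Capital of Japan?", "Tokyo")]
print(filter_hallucinations(fake_model, pairs))
# → [('Capital of France?', 'Paris')]
```

Only the first pair survives, because the model already answers the second question correctly; evaluating an editing method on such already-correct items is exactly the confound the benchmark is designed to remove.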