Can Knowledge Editing Really Correct Hallucinations?
October 21, 2024
Authors: Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu
cs.AI
Abstract
Large Language Models (LLMs) suffer from hallucinations, i.e., non-factual
information in generated content, despite their strong capabilities across
tasks. Meanwhile, knowledge editing has emerged as a popular new paradigm
for correcting the erroneous factual knowledge encoded in LLMs, with the
advantage of avoiding retraining from scratch. However, one common issue of
existing evaluation datasets for knowledge editing is that they do not ensure
LLMs actually generate hallucinated answers to the evaluation questions before
editing. When LLMs are evaluated on such datasets after being edited by
different techniques, it is hard to directly adopt the performance to assess
the effectiveness of different knowledge editing methods in correcting
hallucinations. Thus, the fundamental question remains insufficiently
validated: Can knowledge editing really correct hallucinations in LLMs? We
propose HalluEditBench to holistically benchmark knowledge editing methods in
correcting real-world hallucinations. First, we rigorously construct a massive
hallucination dataset with 9 domains, 26 topics, and more than 6,000
hallucinations. Then, we assess the performance of knowledge editing methods in
a holistic way on five dimensions including Efficacy, Generalization,
Portability, Locality, and Robustness. Through HalluEditBench, we provide
new insights into the potential and limitations of different knowledge editing
methods in correcting hallucinations, which could inspire future improvements
and facilitate progress in the field of knowledge editing.
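The key dataset-construction principle above — only including questions the model actually answers incorrectly before editing, so every benchmark item is a verified hallucination for that model — can be sketched as follows. This is a minimal illustration with hypothetical helper names and a toy substring-match check, not the paper's actual pipeline:

```python
def filter_hallucinations(model_answer, qa_pairs):
    """Keep only QA pairs where the model's pre-edit answer is wrong,
    so every retained item is a verified hallucination for this model.

    model_answer: callable mapping a question string to the model's answer.
    qa_pairs: list of (question, gold_answer) tuples.
    """
    hallucinated = []
    for question, gold_answer in qa_pairs:
        answer = model_answer(question)  # pre-edit model output
        # Toy correctness check: does the gold answer appear in the output?
        if gold_answer.lower() not in answer.lower():
            hallucinated.append((question, gold_answer))
    return hallucinated

# Toy stand-in for a real LLM call (assumption for illustration only).
fake_model = lambda q: {"Capital of France?": "Lyon",
                        "Capital of Japan?": "Tokyo"}.get(q, "")

pairs = [("Capital of France?", "Paris"), ("Capital of Japan?", "Tokyo")]
print(filter_hallucinations(fake_model, pairs))
# → [('Capital of France?', 'Paris')]
```

Only the first pair survives, because the model already answers the second question correctly; evaluating an editing method on such already-correct items is exactly the confound the benchmark is designed to remove.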