Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs
May 21, 2025
Authors: Jie Ma, Ning Qu, Zhitao Gao, Rui Xing, Jun Liu, Hongbin Pei, Jiang Xie, Linyun Song, Pinghui Wang, Jing Tao, Zhou Su
cs.AI
Abstract
Knowledge graph-based retrieval-augmented generation seeks to mitigate
hallucinations in Large Language Models (LLMs) caused by insufficient or
outdated knowledge. However, existing methods often fail to fully exploit the
prior knowledge embedded in knowledge graphs (KGs), particularly their
structural information and explicit or implicit constraints. The former can
enhance the faithfulness of LLMs' reasoning, while the latter can improve the
reliability of response generation. Motivated by these observations, we propose
a trustworthy reasoning framework, termed Deliberation over Priors (DP), which
makes full use of the priors contained in KGs. Specifically, DP adopts a
progressive knowledge distillation strategy that integrates structural priors
into LLMs through a combination of supervised fine-tuning and Kahneman-Tversky
optimization, thereby improving the faithfulness of relation path generation.
Furthermore, our framework employs a reasoning-introspection strategy, which
guides LLMs to perform refined reasoning verification based on extracted
constraint priors, ensuring the reliability of response generation. Extensive
experiments on three benchmark datasets demonstrate that DP achieves new
state-of-the-art performance, especially a Hit@1 improvement of 13% on the
ComplexWebQuestions dataset, and generates highly trustworthy responses. We
also conduct various analyses to verify its flexibility and practicality. The
code is available at https://github.com/reml-group/Deliberation-on-Priors.
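
To make two notions from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It shows (a) one way a generated relation path could be checked for faithfulness by grounding it on a KG, and (b) how Hit@1 is commonly computed for a single question. The toy graph, the entity and relation names, and the helper functions `follow_path` and `hit_at_1` are all illustrative assumptions; none of them come from the paper or its repository.

```python
from typing import Dict, List, Set, Tuple

# Toy KG (hypothetical data): (head entity, relation) -> set of tail entities.
ToyKG = Dict[Tuple[str, str], Set[str]]

KG: ToyKG = {
    ("France", "capital"): {"Paris"},
    ("Paris", "located_in_timezone"): {"CET"},
    ("France", "official_language"): {"French"},
}


def follow_path(kg: ToyKG, start: str, path: List[str]) -> Set[str]:
    """Follow a relation path from `start`.

    Returns the set of entities reached at the end of the path, or an
    empty set if some hop has no matching edge (the path cannot be
    grounded on the KG, so it would not count as faithful here).
    """
    frontier: Set[str] = {start}
    for relation in path:
        frontier = {
            tail
            for head in frontier
            for tail in kg.get((head, relation), set())
        }
        if not frontier:
            break
    return frontier


def hit_at_1(ranked_predictions: List[str], gold_answers: Set[str]) -> float:
    """Return 1.0 if the top-ranked prediction is a gold answer, else 0.0."""
    if ranked_predictions and ranked_predictions[0] in gold_answers:
        return 1.0
    return 0.0


if __name__ == "__main__":
    # A path that can be grounded on the toy KG.
    grounded = follow_path(KG, "France", ["capital", "located_in_timezone"])
    # A path whose second hop does not exist for the reached entity.
    ungrounded = follow_path(KG, "France", ["capital", "official_language"])
    print(sorted(grounded), sorted(ungrounded))
    print(hit_at_1(sorted(grounded), {"CET"}))
```

Averaging `hit_at_1` over all questions in a benchmark gives the dataset-level Hit@1 score. The grounding check covers only structural faithfulness; the constraint-based introspection step described in the abstract is not modeled in this sketch.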