GKG-LLM: A Unified Framework for Generalized Knowledge Graph Construction
March 14, 2025
Authors: Jian Zhang, Bifan Wei, Shihao Qi, Haiping Zhu, Jun Liu, Qika Lin
cs.AI
Abstract
The construction of Generalized Knowledge Graph (GKG), including knowledge graphs, event knowledge graphs, and commonsense knowledge graphs, is fundamental for various natural language processing tasks. Current studies typically construct these types of graphs separately, overlooking holistic insights and a potential unification that could yield benefits in both computational resources and usability. A key challenge in developing a unified framework for GKG, however, lies in the obstacles that arise from task-specific differences. In this study, we propose a unified framework for constructing generalized knowledge graphs to address this challenge. First, we collect data covering 15 sub-tasks from 29 datasets across the three types of graphs, categorizing it into in-sample, counter-task, and out-of-distribution (OOD) data. Then, we propose a three-stage curriculum learning fine-tuning framework that iteratively injects knowledge from the three types of graphs into a Large Language Model. Extensive experiments show that our proposed model improves the construction of all three graph types on in-domain, OOD, and counter-task data.
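To make the staged training concrete, below is a minimal sketch of what a three-stage curriculum fine-tuning loop of this kind could look like with Hugging Face transformers. It is not the authors' implementation: the base model name, dataset files, stage ordering, and hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of three-stage curriculum fine-tuning: one stage per
# graph type, reusing the same model so later stages build on earlier ones.
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)
from datasets import load_dataset

MODEL_NAME = "meta-llama/Llama-2-7b-hf"  # placeholder base LLM, not from the paper

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.pad_token = tokenizer.pad_token or tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Assumed stage ordering: knowledge graph -> event KG -> commonsense KG.
# Each stage injects the construction sub-tasks of one graph type.
CURRICULUM = [
    ("stage1_kg", "data/kg_construction.jsonl"),
    ("stage2_ekg", "data/event_kg_construction.jsonl"),
    ("stage3_ckg", "data/commonsense_kg_construction.jsonl"),
]

def tokenize(batch):
    # Assumes each record carries a single instruction-style "text" field.
    return tokenizer(batch["text"], truncation=True, max_length=1024)

# Causal-LM collator: copies input_ids to labels for next-token prediction.
collator = DataCollatorForLanguageModeling(tokenizer, mlm=False)

for stage_name, data_file in CURRICULUM:
    dataset = load_dataset("json", data_files=data_file, split="train")
    dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)
    args = TrainingArguments(
        output_dir=f"checkpoints/{stage_name}",
        num_train_epochs=1,
        per_device_train_batch_size=4,
        learning_rate=2e-5,
        save_strategy="epoch",
    )
    # Reusing the same `model` object carries knowledge injected in earlier
    # stages into the next one, mimicking the curriculum-style iteration.
    Trainer(
        model=model,
        args=args,
        train_dataset=dataset,
        data_collator=collator,
    ).train()
```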