NodeRAG: Structuring Graph-based RAG with Heterogeneous Nodes
April 15, 2025
Authors: Tianyang Xu, Haojie Zheng, Chengze Li, Haoxiang Chen, Yixin Liu, Ruoxi Chen, Lichao Sun
cs.AI
Abstract
Retrieval-augmented generation (RAG) empowers large language models to access
external and private corpora, enabling factually consistent responses in
specific domains. By exploiting the inherent structure of the corpus,
graph-based RAG methods further enrich this process by building a knowledge
graph index and leveraging the structural nature of graphs. However, current
graph-based RAG approaches seldom prioritize the design of graph structures.
Inadequately designed graphs not only impede the seamless integration of diverse
graph algorithms but also result in workflow inconsistencies and degraded
performance. To further unleash the potential of graphs for RAG, we propose
NodeRAG, a graph-centric framework introducing heterogeneous graph structures
that enable the seamless and holistic integration of graph-based methodologies
into the RAG workflow. By aligning closely with the capabilities of LLMs, this
framework ensures a fully cohesive and efficient end-to-end process. Through
extensive experiments, we demonstrate that NodeRAG exhibits performance
advantages over previous methods, including GraphRAG and LightRAG, not only in
indexing time, query time, and storage efficiency but also in delivering
superior question-answering performance on multi-hop benchmarks and open-ended
head-to-head evaluations with minimal retrieval tokens. Our GitHub repository
is available at https://github.com/Terry-Xu-666/NodeRAG.