Breaking the Static Graph: Context-Aware Traversal for Robust Retrieval-Augmented Generation
February 2, 2026
Authors: Kwun Hang Lau, Fangyuan Zhang, Boyu Ruan, Yingli Zhou, Qintian Guo, Ruiyuan Zhang, Xiaofang Zhou
cs.AI
Abstract
Recent advances in Retrieval-Augmented Generation (RAG) have shifted from simple vector similarity to structure-aware approaches such as HippoRAG, which leverage Knowledge Graphs (KGs) and Personalized PageRank (PPR) to capture multi-hop dependencies. However, these methods suffer from a "Static Graph Fallacy": they rely on fixed transition probabilities determined during indexing. This rigidity ignores the query-dependent nature of edge relevance and causes semantic drift, in which random walks are diverted into high-degree "hub" nodes before reaching critical downstream evidence. Consequently, models often achieve high partial recall but fail to retrieve the complete evidence chain required for multi-hop queries.
To address this, we propose CatRAG (Context-Aware Traversal for robust RAG), a framework that builds on the HippoRAG 2 architecture and transforms the static KG into a query-adaptive navigation structure. Three mechanisms steer the random walk: (1) Symbolic Anchoring, which injects weak entity constraints to regularize the random walk; (2) Query-Aware Dynamic Edge Weighting, which dynamically modulates the graph structure to prune irrelevant paths while amplifying those aligned with the query's intent; and (3) Key-Fact Passage Weight Enhancement, a cost-efficient bias that structurally anchors the random walk to likely evidence.
Experiments across four multi-hop benchmarks demonstrate that CatRAG consistently outperforms state-of-the-art baselines. Our analysis reveals that while standard Recall metrics show modest gains, CatRAG achieves substantial improvements in reasoning completeness, that is, the capacity to recover the entire evidence path without gaps. These results indicate that our approach effectively bridges the gap between retrieving partial context and enabling fully grounded reasoning. Resources are available at https://github.com/kwunhang/CatRAG.
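To make the "Static Graph Fallacy" concrete, the following is a minimal sketch of Personalized PageRank on a toy graph, run once with fixed index-time weights and once with query-dependent weights. It is not the authors' implementation: the graph, the node names, the `reweight` helper, the hard-coded relevance scores, the `floor` value, and the damping factor are all illustrative assumptions, and the abstract does not specify how CatRAG actually scores edges.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.5, iters=100):
    """Power-iteration PPR on a directed, weighted graph.

    edges: {(src, dst): weight}; seeds: {node: restart mass}.
    `damping` is the probability of following an edge (assumed value).
    """
    nodes = sorted({n for edge in edges for n in edge} | set(seeds))
    out = defaultdict(list)
    for (u, v), w in edges.items():
        if w > 0:
            out[u].append((v, w))
    total = sum(seeds.values())
    restart = {n: seeds.get(n, 0.0) / total for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        nxt = {n: (1 - damping) * restart[n] for n in nodes}
        for u in nodes:
            if not out[u]:
                # Dangling node: send its mass back to the seed nodes.
                for n in nodes:
                    nxt[n] += damping * rank[u] * restart[n]
                continue
            z = sum(w for _, w in out[u])
            for v, w in out[u]:
                nxt[v] += damping * rank[u] * w / z
        rank = nxt
    return rank


def reweight(edges, relevance, floor=0.1):
    """Hypothetical query-aware weighting: scale each edge by a relevance
    score in [0, 1]; edges without a score are damped to `floor`."""
    return {e: w * relevance.get(e, floor) for e, w in edges.items()}


# Toy query: "Where was the director of film_x born?"
# Evidence chain: film_x -> director -> birthplace. "usa" is a hub node.
edges = {
    ("film_x", "director"): 1.0,
    ("film_x", "usa"): 1.0,
    ("film_x", "drama"): 1.0,
    ("director", "birthplace"): 1.0,
    ("director", "usa"): 1.0,
    ("usa", "film_y"): 1.0,
    ("usa", "film_z"): 1.0,
}
seeds = {"film_x": 1.0}

# Assumed relevance scores; a real system would compute these per query.
relevance = {("film_x", "director"): 1.0, ("director", "birthplace"): 1.0}

static_rank = personalized_pagerank(edges, seeds)
adaptive_rank = personalized_pagerank(reweight(edges, relevance), seeds)

for node in ("usa", "birthplace"):
    print(node, round(static_rank[node], 3), round(adaptive_rank[node], 3))
```

With fixed weights the hub node collects more probability mass than the final evidence node, whereas after reweighting the ordering flips. This mirrors the drift the abstract describes, but only on a hand-built example.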