GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search

Abstract

Multi-agent systems powered by Large Language Models excel at complex tasks through coordinated collaboration, yet they face high failure rates in multi-turn deep search scenarios. Existing temporal attribution methods struggle to accurately diagnose root causes, particularly when errors propagate across multiple agents. Attempts to automate failure attribution by analyzing action sequences remain ineffective because they cannot account for information dependencies that span agents. This paper identifies two core challenges: (i) distinguishing symptoms from root causes in multi-agent error propagation, and (ii) tracing information dependencies beyond temporal order. To address these issues, we introduce GraphTracer, a framework that redefines failure attribution through information flow analysis. GraphTracer constructs Information Dependency Graphs (IDGs) to explicitly capture how agents reference and build on prior outputs. It localizes root causes by tracing through these dependency structures instead of relying on temporal sequences. GraphTracer also uses graph-aware synthetic data generation to target critical nodes, creating realistic failure scenarios. Evaluations on the Who&When benchmark and integration into production systems demonstrate that GraphTracer-8B achieves up to 18.18% higher attribution accuracy compared to state-of-the-art models and enables 4.8% to 14.2% performance improvements in deployed multi-agent frameworks, establishing a robust solution for multi-agent system debugging.
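The difference between tracing dependencies and following temporal order can be illustrated with a toy example. The sketch below is not the paper's implementation: the `Step` schema, the `faulty` labels (assumed to be given here, whereas GraphTracer uses a trained model for attribution), and the rule "a root cause is a faulty step with no faulty parent" are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Step:
    """One agent action in a trajectory (hypothetical schema)."""
    idx: int                                       # position in the trajectory list
    agent: str
    refs: list[int] = field(default_factory=list)  # earlier steps whose output this step used
    faulty: bool = False                           # assumed to be labelled in advance


def ancestors(steps, start):
    """Return every step index reachable from `start` by following refs backwards."""
    seen, stack = set(), [start]
    while stack:
        for parent in steps[stack.pop()].refs:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def root_causes(steps, failed):
    """Faulty steps upstream of `failed` (or `failed` itself) that have no faulty parent."""
    candidates = ancestors(steps, failed) | {failed}
    return sorted(
        i for i in candidates
        if steps[i].faulty and not any(steps[p].faulty for p in steps[i].refs)
    )


def earliest_faulty(steps):
    """Temporal baseline: blame the first faulty step in time order."""
    return next((s.idx for s in steps if s.faulty), None)


trajectory = [
    Step(0, "planner", faulty=True),             # faulty, but nothing downstream uses it
    Step(1, "searcher", faulty=True),            # bad retrieval with no faulty input
    Step(2, "reader", refs=[1], faulty=True),    # symptom: inherits the bad retrieval
    Step(3, "answerer", refs=[2], faulty=True),  # final failure
]

print(root_causes(trajectory, failed=3))  # [1]
print(earliest_faulty(trajectory))        # 0
```

In this toy trajectory the temporal baseline blames step 0, whose output never reaches the failing step, while the dependency walk isolates step 1 and treats steps 2 and 3 as downstream symptoms.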
