GraphTracer: Graph-Guided Failure Tracing in LLM Agents for Robust Multi-Turn Deep Search

Abstract

Multi-agent systems powered by Large Language Models excel at complex tasks through coordinated collaboration, yet they face high failure rates in multi-turn deep search scenarios. Existing temporal attribution methods struggle to accurately diagnose root causes, particularly when errors propagate across multiple agents. Attempts to automate failure attribution by analyzing action sequences remain ineffective because they cannot account for information dependencies that span agents. This paper identifies two core challenges: (i) distinguishing symptoms from root causes in multi-agent error propagation, and (ii) tracing information dependencies beyond temporal order. To address these issues, we introduce GraphTracer, a framework that redefines failure attribution through information flow analysis. GraphTracer constructs Information Dependency Graphs (IDGs) to explicitly capture how agents reference and build on prior outputs. It localizes root causes by tracing through these dependency structures instead of relying on temporal sequences. GraphTracer also uses graph-aware synthetic data generation to target critical nodes, creating realistic failure scenarios. Evaluations on the Who&When benchmark and integration into production systems demonstrate that GraphTracer-8B achieves up to 18.18% higher attribution accuracy compared to state-of-the-art models and enables 4.8% to 14.2% performance improvements in deployed multi-agent frameworks, establishing a robust solution for multi-agent system debugging.
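To make the idea of tracing dependencies rather than time order concrete, the following is a minimal illustrative sketch, not the GraphTracer implementation. The abstract does not specify the graph format or the tracing procedure, so everything here is an assumption: the step records, the `refs` field, the set of steps flagged as faulty (imagined as coming from some per-step check), and the function names `build_idg`, `ancestors`, and `trace_root_causes` are all hypothetical. The sketch treats a faulty step as a root cause only if none of the steps it depends on is itself faulty.

```python
from collections import deque

def build_idg(steps):
    """Map each step id to the ids of earlier steps whose output it references.

    `steps` is a hypothetical trajectory format, not one defined by the paper.
    """
    return {s["id"]: list(s["refs"]) for s in steps}

def ancestors(idg, start):
    """Return every step that `start` depends on, directly or transitively."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for parent in idg.get(node, ()):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return seen

def trace_root_causes(idg, faulty, failed_step):
    """Among faulty steps feeding the failed step, keep those with no faulty ancestor."""
    suspects = (ancestors(idg, failed_step) | {failed_step}) & set(faulty)
    return sorted(s for s in suspects if not (ancestors(idg, s) & set(faulty)))

# Hypothetical six-step trajectory; step ids follow temporal order.
steps = [
    {"id": 1, "agent": "planner", "refs": []},
    {"id": 2, "agent": "searcher", "refs": [1]},    # assumed origin of the error
    {"id": 3, "agent": "browser", "refs": [1]},
    {"id": 4, "agent": "summarizer", "refs": [2]},  # symptom: inherits step 2's error
    {"id": 5, "agent": "browser", "refs": [3]},
    {"id": 6, "agent": "answerer", "refs": [4, 5]}, # final failure
]
idg = build_idg(steps)
root_causes = trace_root_causes(idg, faulty={2, 4, 6}, failed_step=6)
print(root_causes)
```

In this toy trajectory, steps 4 and 6 are symptoms because each depends on an earlier faulty step, while step 2 is the only faulty step with a clean set of dependencies. A purely temporal reading would see step 5 as the step immediately preceding the failure, even though it lies on an unaffected branch of the graph.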
