What Should Agents Say? Action-State Communication for Efficient Multi-Agent Systems

Abstract

Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. This free-form communication can rapidly inflate token usage and exhaust the shared context window, ultimately affecting both system performance and inference cost. We analyze five common inter-agent communication strategies across two MAS topologies and find that no fixed strategy is universally optimal. Instead, effective inter-agent messages consistently preserve the action-centered information that downstream agents need. Building on this finding, we propose PACT (Protocolized Action-state Communication and Transmission), which treats inter-agent communication as a public state-update problem and projects each raw agent output into a compact action-state record before it enters the shared history. Across different MAS topologies, PACT consistently improves the performance-cost trade-off, achieving comparable or stronger task performance with substantially fewer tokens. The gains extend to production coding harnesses: PACT raises the resolve rate of OpenHands while using 10% fewer tokens per resolved issue, and it leaves the resolve rate of SWE-agent unchanged while halving input tokens. Our code is publicly available at https://github.com/iNLP-Lab/PACT.
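
To make the idea of projecting a raw agent output into a compact action-state record concrete, the following is a minimal sketch under stated assumptions, not the PACT implementation. The record fields (`agent`, `action`, `state`), the `ACTION:`/`STATE:` line tags, and the names `ActionStateRecord`, `project`, and `SharedHistory` are all hypothetical and chosen only for illustration; the actual record schema and projection step are defined in the released code.

```python
from dataclasses import dataclass, field


@dataclass
class ActionStateRecord:
    """Hypothetical compact record; the field names are illustrative assumptions."""
    agent: str
    action: str
    state: str

    def render(self) -> str:
        return f"[{self.agent}] action={self.action} | state={self.state}"


def project(agent: str, raw_output: str) -> ActionStateRecord:
    """Stand-in projection: keep only lines tagged ACTION: or STATE:.

    Assumption: the raw output carries these tags. This simple heuristic
    only illustrates the shape of the step, not how PACT performs it.
    """
    action, state = "", ""
    for line in raw_output.splitlines():
        stripped = line.strip()
        if stripped.startswith("ACTION:"):
            action = stripped[len("ACTION:"):].strip()
        elif stripped.startswith("STATE:"):
            state = stripped[len("STATE:"):].strip()
    return ActionStateRecord(agent=agent, action=action, state=state)


@dataclass
class SharedHistory:
    """Shared context that stores projected records, never raw outputs."""
    records: list = field(default_factory=list)

    def publish(self, agent: str, raw_output: str) -> ActionStateRecord:
        # The raw output is projected before it enters the shared history.
        record = project(agent, raw_output)
        self.records.append(record)
        return record

    def context(self) -> str:
        return "\n".join(r.render() for r in self.records)


raw = (
    "Let me think about this step by step. The parser crashes on empty\n"
    "input, so I looked at several files before deciding what to change.\n"
    "ACTION: edit parser.py to handle empty input\n"
    "After the change I reran the suite a few times to be sure.\n"
    "STATE: tests pass locally\n"
)

history = SharedHistory()
history.publish("coder", raw)
print(history.context())
print(len(raw), "->", len(history.context()), "characters")
```

In this sketch, downstream agents read `history.context()` rather than the raw transcript, so the free-form reasoning never enters the shared context window.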