What Should Agents Say? Action-State Communication for Efficient Multi-Agent Systems

Abstract

Multi-agent systems (MAS) built on large language models are typically organized around roles, pipelines, and turn schedules, while the content that agents pass to one another is often left as unconstrained natural language. However, this free-form communication can rapidly inflate token usage, consume the shared context window, and ultimately hurt both system performance and inference cost. We analyze five common inter-agent communication strategies across two MAS topologies and find that no fixed strategy is universally optimal. Instead, effective inter-agent messages consistently preserve the action-centered information that downstream agents need. Building on this, we propose PACT (Protocolized Action-state Communication and Transmission), which treats inter-agent communication as a public state-update problem and projects each raw agent output into a compact action-state record before it enters the shared history. Across different MAS topologies, PACT consistently improves the performance-cost trade-off, achieving comparable or stronger task performance with substantially fewer tokens. The gains extend to production coding harnesses: PACT lifts the resolve rate of OpenHands while using 10% fewer tokens per resolved task, and is resolve-neutral on SWE-agent while halving input tokens. Our code is publicly available at https://github.com/iNLP-Lab/PACT.
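To make the core idea concrete, the following is a minimal sketch of projecting a raw agent output into a compact action-state record before it is appended to a shared history. It is not the PACT implementation: the record fields (`agent`, `action`, `state`), the `ACTION:`/`STATE:` line tags, and the names `ActionStateRecord`, `project`, and `SharedHistory` are all assumptions made for illustration, and a real system would likely use a more capable projection step than simple line matching.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionStateRecord:
    """Hypothetical compact record; the field names are illustrative only."""
    agent: str
    action: str
    state: str


def project(agent: str, raw_output: str) -> ActionStateRecord:
    """Toy projection: keep lines tagged ACTION: or STATE:, drop the rest.

    The tag convention is an assumption of this sketch, not part of PACT.
    If a tag appears more than once, the last occurrence wins.
    """
    action, state = "", ""
    for line in raw_output.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("ACTION:"):
            action = stripped[len("ACTION:"):].strip()
        elif stripped.upper().startswith("STATE:"):
            state = stripped[len("STATE:"):].strip()
    return ActionStateRecord(agent=agent, action=action, state=state)


class SharedHistory:
    """Shared history that stores only projected records, never raw outputs."""

    def __init__(self) -> None:
        self.records: list[ActionStateRecord] = []

    def append_raw(self, agent: str, raw_output: str) -> ActionStateRecord:
        # Project first, so the verbose raw text never enters shared context.
        record = project(agent, raw_output)
        self.records.append(record)
        return record

    def render(self) -> str:
        # Text that downstream agents would see in place of the raw messages.
        return "\n".join(
            f"[{r.agent}] action={r.action} | state={r.state}"
            for r in self.records
        )


raw = (
    "Let me think about this at length...\n"
    "ACTION: edit utils.py to fix off-by-one\n"
    "STATE: tests pass locally\n"
    "Some more rambling."
)
history = SharedHistory()
history.append_raw("coder", raw)
print(history.render())
```

In this toy setup, downstream agents read only the rendered records, so the free-form reasoning around the tagged lines never consumes shared context. Whether such a projection preserves enough information depends on the task and on how the projection is performed.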