arXiv: 2605.22786v1
LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems
May 21, 2026
Authors: Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy
cs.AI, cs.ET, cs.LG, cs.MA
Abstract
Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations that are applied before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.
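The adversarial formulation described in the abstract can be read as a min-max problem. The sketch below is one generic way to write such an objective and is not taken from the paper: the symbols $C$ (the KV cache an agent would share), $x_s$ (that agent's sensitive input), $T_\theta$ (the learned transformation), $D_\phi$ (the adversarial decoder), the two losses, and the trade-off weight $\lambda > 0$ are all notation assumed here for illustration.

```latex
% Illustrative notation only (assumed, not the paper's):
%   C        : KV cache artifact produced by the sending agent
%   x_s      : agent-specific sensitive input
%   T_\theta : learned representation-level transformation of the cache
%   D_\phi   : adversarial decoder that tries to recover x_s
%   \lambda  : assumed weight trading task utility against leakage
\min_{\theta}\;\max_{\phi}\;
\mathbb{E}_{(C,\,x_s)}\Big[
  \mathcal{L}_{\mathrm{task}}\big(T_\theta(C)\big)
  \;-\;
  \lambda\,\mathcal{L}_{\mathrm{rec}}\big(D_\phi(T_\theta(C)),\,x_s\big)
\Big]
```

Under this reading, the inner maximization over $\phi$ drives the reconstruction loss down (the adversary gets better at recovering $x_s$), while the outer minimization over $\theta$ keeps the task loss low and pushes the reconstruction loss up. The paper's actual losses and constraints may differ.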