LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce **LCGuard** (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations that are applied before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.
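
The adversarial formulation described above can be read as a min-max problem. The following is only an illustrative sketch, not the paper's stated objective: the symbols $T_\theta$ (the learned cache transformation), $D_\phi$ (the adversarial decoder), $C = (K, V)$ (the KV cache to be shared), $s$ (the agent-specific sensitive input), the two loss terms, and the trade-off weight $\lambda$ are all assumed notation introduced here for illustration.

```latex
% Illustrative sketch only; all symbols are assumed notation, not taken from the paper.
% T_\theta : transformation applied to the KV cache before sharing
% D_\phi   : adversarial decoder trying to reconstruct the sensitive input s
% \lambda  : assumed weight trading off task utility against leakage
\begin{equation}
  \min_{\theta} \, \max_{\phi} \;
  \mathbb{E}_{(C,\, s)} \Big[
    \mathcal{L}_{\mathrm{task}}\big(T_\theta(C)\big)
    \;-\; \lambda \, \mathcal{L}_{\mathrm{rec}}\big(D_\phi(T_\theta(C)),\, s\big)
  \Big]
\end{equation}
```

Under this reading, the inner maximization over $\phi$ drives the reconstruction loss $\mathcal{L}_{\mathrm{rec}}$ down, so the adversary becomes as strong as possible. The outer minimization over $\theta$ keeps the task loss low while pushing the reconstruction loss of that best-case adversary up. A shared artifact $T_\theta(C)$ would then be considered safer the larger the reconstruction loss that remains after the adversary has been trained.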
