LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. Most existing systems communicate through natural language, but recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations that are applied before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that, compared to standard KV-sharing baselines, LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance.
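
The adversarial formulation described above can be read as a min-max problem. The following is only an illustrative sketch, not the paper's stated objective: the symbols $T_\theta$ (the learned cache transformation), $D_\phi$ (the adversarial decoder), $C$ (the KV cache artifact), $s$ (the agent-specific sensitive input), $y$ (the task target), the losses $\mathcal{L}_{\text{task}}$ and $\mathcal{L}_{\text{rec}}$, and the trade-off weight $\lambda$ are all assumed notation introduced here for illustration.

```latex
% Assumed notation (not from the source):
%   T_\theta : transformation applied to the cache before sharing
%   D_\phi   : adversarial decoder that tries to reconstruct s
%   \lambda  : weight trading task utility against leakage reduction
\min_{\theta} \; \max_{\phi} \;
  \mathbb{E}_{(C,\, s,\, y)} \Big[
    \mathcal{L}_{\text{task}}\big(T_\theta(C),\, y\big)
    \;-\; \lambda \, \mathcal{L}_{\text{rec}}\big(D_\phi(T_\theta(C)),\, s\big)
  \Big]
```

Under this reading, the inner maximization over $\phi$ trains the adversary to minimize its reconstruction loss, while the outer minimization over $\theta$ keeps the task loss low and pushes the adversary's best achievable reconstruction loss up.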
