arXiv: 2605.22786v1

LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

May 21, 2026
Authors: Sadia Asif, Mohammad Mohammadi Amiri, Momin Abbas, Prasanna Sattigeri, Karthikeyan Natesan Ramamurthy
cs.AI, cs.ET, cs.LG, cs.MA

Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce LCGuard (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations that are applied before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.
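
The adversarial formulation summarized in the abstract can be read as a min-max objective. The following is only an illustrative sketch: the abstract gives no notation, so every symbol here is an assumption rather than the paper's own. Let $C$ be a KV cache artifact, $s$ the agent-specific sensitive input, $T_\theta$ the learned transformation, $D_\phi$ the adversarial decoder, $\mathcal{L}_{\text{task}}$ a task loss, $\mathcal{L}_{\text{rec}}$ a reconstruction loss, and $\lambda > 0$ a hypothetical trade-off weight.

```latex
% Illustrative only: symbols and the weight \lambda are assumed, not taken from the paper.
\min_{\theta} \; \max_{\phi} \;
\mathbb{E}_{(C,\, s)} \Big[
  \mathcal{L}_{\text{task}}\big(T_\theta(C)\big)
  \;-\; \lambda \, \mathcal{L}_{\text{rec}}\big(D_\phi(T_\theta(C)),\, s\big)
\Big]
```

Under this reading, the inner maximization over $\phi$ drives the reconstruction loss down (the adversary gets better at recovering $s$), while the outer minimization over $\theta$ keeps the task loss low and pushes the reconstruction loss up, which matches the stated goal of preserving task-relevant semantics while reducing reconstructable information.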
PDF (May 22, 2026)