LCGuard: Latent Communication Guard for Safe KV Sharing in Multi-Agent Systems

Abstract

Large language model (LLM)-based multi-agent systems increasingly rely on intermediate communication to coordinate complex tasks. While most existing systems communicate through natural language, recent work shows that latent communication, particularly through transformer key-value (KV) caches, can improve efficiency and preserve richer task-relevant information. However, KV caches also encode contextual inputs, intermediate reasoning states, and agent-specific information, creating an opaque channel through which sensitive content may propagate across agents without explicit textual disclosure. To address this, we introduce \textbf{LCGuard} (Latent Communication Guard), a framework for safe KV-based latent communication in multi-agent LLM systems. LCGuard treats shared KV caches as latent working memory and learns representation-level transformations that are applied before cache artifacts are transmitted across agents. We formalize representation-level sensitive information leakage operationally through reconstruction: a shared cache artifact is unsafe if an adversarial decoder can recover agent-specific sensitive inputs from it. This leads to an adversarial training formulation in which the adversary learns to reconstruct sensitive inputs, while LCGuard learns transformations that preserve task-relevant semantics and reduce reconstructable information. Empirical evaluations across multiple model families and multi-agent benchmarks show that LCGuard consistently reduces reconstruction-based leakage and attack success rates while maintaining competitive task performance compared to standard KV-sharing baselines.
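
The adversarial formulation described in the abstract can be illustrated with a generic min-max objective. The following is only a sketch under stated assumptions, not the objective used in the paper: the symbols $K$ (shared KV cache artifact), $x_s$ (agent-specific sensitive input), $y$ (task target), $T_\theta$ (learned transformation), $D_\phi$ (adversarial decoder), the losses $\mathcal{L}_{\text{task}}$ and $\mathcal{L}_{\text{rec}}$, and the trade-off weight $\lambda$ are all hypothetical notation introduced here for illustration.

```latex
% Hypothetical notation (not taken from the paper):
%   K        shared KV cache artifact
%   x_s      agent-specific sensitive input
%   y        task target
%   T_theta  learned representation-level transformation (the guard)
%   D_phi    adversarial decoder that tries to reconstruct x_s
%   lambda   assumed trade-off weight, lambda > 0
\begin{equation*}
\min_{\theta} \; \max_{\phi} \;
\mathbb{E}_{(K,\, x_s,\, y)}
\Big[
  \mathcal{L}_{\text{task}}\big(T_\theta(K),\, y\big)
  \;-\; \lambda \,
  \mathcal{L}_{\text{rec}}\big(D_\phi(T_\theta(K)),\, x_s\big)
\Big]
\end{equation*}
```

Under this sketch, the inner maximization over $\phi$ drives the decoder to minimize its reconstruction loss, while the outer minimization over $\theta$ keeps the task loss low and pushes the reconstruction loss up, which mirrors the stated goal of preserving task-relevant semantics while reducing reconstructable information.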
