SPILLage: Agentic Oversharing on the Web

Abstract

LLM-powered agents are beginning to automate users' tasks across the open web, often with access to user resources such as emails and calendars. Unlike standard LLMs answering questions in a controlled chatbot setting, web agents act "in the wild", interacting with third parties and leaving behind an action trace. We therefore ask: how do web agents handle user resources when accomplishing tasks on users' behalf across live websites? In this paper, we formalize Natural Agentic Oversharing: the unintentional disclosure of task-irrelevant user information through an agent's trace of actions on the web. We introduce SPILLage, a framework that characterizes oversharing along two dimensions: channel (content vs. behavior) and directness (explicit vs. implicit). This taxonomy reveals a critical blind spot: while prior work focuses on text leakage, web agents also overshare behaviorally through clicks, scrolls, and navigation patterns that can be monitored. We benchmark 180 tasks on live e-commerce sites with ground-truth annotations separating task-relevant from task-irrelevant attributes. Across 1,080 runs spanning two agentic frameworks and three backbone LLMs, we demonstrate that oversharing is pervasive, with behavioral oversharing dominating content oversharing by 5x. This effect persists, and can even worsen, under prompt-level mitigation. However, removing task-irrelevant information before execution improves task success by up to 17.9%, demonstrating that reducing oversharing improves task success. Our findings underscore that protecting privacy in web agents is a fundamental challenge, requiring a broader view of "output" that accounts for what agents do on the web, not just what they type. Our datasets and code are available at https://github.com/jrohsc/SPILLage.
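
As a reading aid, the two-dimensional taxonomy described in the abstract (channel crossed with directness) can be sketched as a small data structure. This is only an illustrative sketch, not the authors' implementation: the names `Channel`, `Directness`, `OversharingEvent`, and `tally_by_cell` are hypothetical, and the example events are made up rather than taken from the benchmark. The final line checks that the reported 1,080 runs is arithmetically consistent with 180 tasks run under two frameworks and three backbone LLMs, which is an assumption about how the runs were counted.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Channel(Enum):
    CONTENT = "content"    # what the agent types or submits
    BEHAVIOR = "behavior"  # clicks, scrolls, navigation patterns


class Directness(Enum):
    EXPLICIT = "explicit"
    IMPLICIT = "implicit"


@dataclass(frozen=True)
class OversharingEvent:
    """One disclosure of a task-irrelevant attribute (hypothetical record type)."""
    attribute: str
    channel: Channel
    directness: Directness


def tally_by_cell(events):
    """Count events in each (channel, directness) cell of the 2x2 taxonomy."""
    return Counter((e.channel, e.directness) for e in events)


# Made-up illustrative events, not data from the paper.
events = [
    OversharingEvent("health condition", Channel.CONTENT, Directness.EXPLICIT),
    OversharingEvent("health condition", Channel.BEHAVIOR, Directness.IMPLICIT),
    OversharingEvent("home city", Channel.BEHAVIOR, Directness.EXPLICIT),
]
counts = tally_by_cell(events)

# Assumed breakdown of the run count: tasks x frameworks x backbone LLMs.
total_runs = 180 * 2 * 3
```

Keeping the two dimensions as separate enums means each observed event falls into exactly one of the four cells, so per-channel comparisons such as behavioral versus content oversharing reduce to summing over the other dimension.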
