Capable but Careless: Do Computer-Use Agents Respect Contextual Integrity?

Abstract

Computer-use agents (CUAs) now act on a user's behalf across personal applications such as email, calendars, and to-do lists. This cross-application access is useful, but it also creates a largely overlooked privacy risk: an agent working in one context can pull in information from a different context that is inappropriate for the one it is working in. We therefore introduce AgentCIBench, an evaluation harness that turns this risk into executable, deterministically scored scenarios. We target three common failure modes in CUAs: visual co-location, where the agent pulls in prohibited items that sit next to the task target in the UI; task-ambiguity overshare, where the agent dumps dense personal state in response to an under-specified prompt; and recipient misalignment, where the agent sends content to an addressee for whom it is inappropriate. We evaluate 15 frontier agents and find a surprisingly high failure rate: 11 of 15 leak on more than 50% of scenarios, with an average leakage rate of 67.9%. The same failures persist when agents act end-to-end in the environment to complete the task. We release AgentCIBench to encourage the development of safer computer-use agents and to position contextual disclosure testing as a pre-deployment safety check.
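To make "deterministically scored" concrete, the following is a minimal illustrative sketch, not AgentCIBench's actual interface. The `Scenario` schema, the `leaks` and `leakage_rate` helpers, and the use of case-insensitive substring matching as the leak check are all assumptions made for illustration; the toy scenarios and outputs are invented and are not results from the benchmark.

```python
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    """The three failure modes named in the abstract."""
    VISUAL_CO_LOCATION = "visual co-location"
    TASK_AMBIGUITY_OVERSHARE = "task-ambiguity overshare"
    RECIPIENT_MISALIGNMENT = "recipient misalignment"


@dataclass(frozen=True)
class Scenario:
    """Hypothetical scenario record; the real schema is not given in the abstract."""
    name: str
    mode: FailureMode
    prohibited: tuple[str, ...]  # strings that must not appear in the agent's output


def leaks(scenario: Scenario, agent_output: str) -> bool:
    """Return True if any prohibited string appears in the output.

    Assumption: a case-insensitive substring match stands in for whatever
    deterministic check the benchmark actually uses.
    """
    text = agent_output.casefold()
    return any(item.casefold() in text for item in scenario.prohibited)


def leakage_rate(scenarios: list[Scenario], outputs: dict[str, str]) -> float:
    """Fraction of scenarios in which the agent's output leaked."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    leaked = sum(leaks(s, outputs[s.name]) for s in scenarios)
    return leaked / len(scenarios)


# Toy data, invented for illustration only.
scenarios = [
    Scenario("calendar-invite", FailureMode.VISUAL_CO_LOCATION,
             ("therapy appointment",)),
    Scenario("team-agenda", FailureMode.RECIPIENT_MISALIGNMENT,
             ("salary",)),
]
outputs = {
    "calendar-invite": "Booked the meeting. Note: user has a Therapy Appointment at 3pm.",
    "team-agenda": "Sent the agenda to the team.",
}

rate = leakage_rate(scenarios, outputs)
```

Because the check is a pure function of the scenario and the agent's output, rerunning it on the same transcript always yields the same score, which is the property "deterministic" refers to in this sketch.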