Toward Trustworthy GUI Agents: A Survey

Abstract

GUI agents powered by large foundation models can interact with digital interfaces, enabling applications in web automation, mobile navigation, and software testing. However, their growing autonomy raises critical concerns about security, privacy, and safety. This survey examines the trustworthiness of GUI agents along five dimensions: security vulnerabilities, reliability in dynamic environments, transparency and explainability, ethical considerations, and evaluation methodologies. We also identify major challenges, including vulnerability to adversarial attacks, cascading failure modes in sequential decision-making, and a lack of realistic evaluation benchmarks. These issues not only hinder real-world deployment but also call for comprehensive mitigation strategies that look beyond task success. As GUI agents become more widespread, establishing robust safety standards and responsible development practices is essential. This survey provides a foundation for advancing trustworthy GUI agents through systematic understanding and future research.