SlowBA: An Efficiency Backdoor Attack Towards VLM-based GUI Agents

Abstract

Graphical user interface (GUI) agents built on modern vision-language models (VLMs) are expected not only to execute actions accurately but also to respond to user instructions with low latency. Existing research on GUI-agent security focuses mainly on manipulating action correctness, while the security risks tied to response efficiency remain largely unexplored. In this paper, we introduce SlowBA, a novel backdoor attack that targets the responsiveness of VLM-based GUI agents. The key idea is to manipulate response latency by inducing excessively long reasoning chains whenever a specific trigger pattern is present. To achieve this, we propose a two-stage reward-level backdoor injection (RBI) strategy: the first stage aligns the model with the long-response format, and the second stage learns trigger-aware activation through reinforcement learning. In addition, we design realistic pop-up windows as triggers; because such windows appear naturally in GUI environments, they improve the stealthiness of the attack. Extensive experiments across multiple datasets and baselines demonstrate that SlowBA significantly increases response length and latency while largely preserving task accuracy. The attack remains effective even with a small poisoning ratio and under several defense settings. These findings reveal a previously overlooked security vulnerability in GUI agents and highlight the need for defenses that consider both action correctness and response efficiency. Code is available at https://github.com/tu-tuing/SlowBA.
