Why Are Web AI Agents More Vulnerable Than Standalone LLMs? A Security Analysis
February 27, 2025
Authors: Jeffrey Yang Fan Chiang, Seungjae Lee, Jia-Bin Huang, Furong Huang, Yizheng Chen
cs.AI
Abstract
Recent advancements in Web AI agents have demonstrated remarkable
capabilities in addressing complex web navigation tasks. However, emerging
research shows that these agents exhibit greater vulnerability compared to
standalone Large Language Models (LLMs), despite both being built upon the same
safety-aligned models. This discrepancy is particularly concerning given the
greater flexibility of Web AI agents compared to standalone LLMs, which may
expose them to a wider range of adversarial user inputs. To build a scaffold
that addresses these concerns, this study investigates the underlying factors
that contribute to the increased vulnerability of Web AI agents. Notably, this
disparity stems from the multifaceted differences between Web AI agents and
standalone LLMs, as well as from complex signals: nuances that simple
evaluation metrics, such as success rate, often fail to capture. To tackle
these challenges, we propose a component-level analysis and a more granular,
systematic evaluation framework. Through this fine-grained investigation, we
identify three critical factors that amplify the vulnerability of Web AI
agents: (1) embedding user goals into the system prompt, (2) multi-step action
generation, and (3) observational capabilities. Our findings highlight the
pressing need to enhance security and robustness in AI agent design and provide
actionable insights for targeted defense strategies.