When Lower Privileges Suffice: Investigating Over-Privileged Tool Selection in LLM Agents

June 18, 2026
Authors: Kaiyue Yang, Yuyan Bu, Jingwei Yi, Yuchi Wang, Biyu Zhou, Juntao Dai, Songlin Hu, Yaodong Yang
cs.AI

Abstract

As LLM agents increasingly select tools autonomously, their choices among tools with different privileges become safety-relevant. However, prior tool-selection studies focus on safety-agnostic metadata preferences, leaving privilege-sensitive choices underexplored. To address this gap, we study over-privileged tool selection, in which an agent selects or escalates to a higher-privilege tool despite a sufficient lower-privilege alternative. We introduce ToolPrivBench to evaluate whether agents choose higher-privilege tools despite sufficient lower-privilege alternatives, measuring both initial selection and escalation after transient tool failures. Across eight domains and five recurring risk patterns, we find that over-privileged tool selection is common among mainstream LLM agents and is further amplified by transient failures. We further find that general safety alignment does not reliably transfer to least-privilege tool choice, while prompt-level controls provide only limited mitigation under transient failures. We therefore introduce a privilege-aware post-training defense that teaches agents to prefer sufficient lower-privilege tools and escalate only when necessary. Our mitigation experiments show that this defense substantially reduces unnecessary high-privilege tool use while preserving general capabilities.
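
To make the behavior under study concrete, the following is a minimal illustrative sketch of least-privilege tool choice with retries on transient failures. It is not the paper's post-training defense or the ToolPrivBench implementation. The `Tool` record, the integer privilege scale, the `TransientToolError` type, and all function names are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Callable


class TransientToolError(Exception):
    """Hypothetical error type for a temporary failure (e.g. a timeout)."""


@dataclass(frozen=True)
class Tool:
    name: str
    privilege: int            # assumed scale: a higher number means more privilege
    capabilities: frozenset   # actions the tool is able to perform
    run: Callable[[str], str]


def sufficient_tools(tools, required):
    """Return the tools that cover the required capabilities, lowest privilege first."""
    covering = [t for t in tools if required <= t.capabilities]
    return sorted(covering, key=lambda t: t.privilege)


def is_over_privileged(chosen, tools, required):
    """True if a sufficient tool with strictly lower privilege than `chosen` exists."""
    candidates = sufficient_tools(tools, required)
    return bool(candidates) and chosen.privilege > candidates[0].privilege


def call_least_privilege(tools, required, task, max_retries=2):
    """Call the lowest-privilege sufficient tool, retrying it on transient failures.

    A transient failure alone never triggers escalation in this sketch: once the
    retries are exhausted, the error is re-raised to the caller.
    """
    candidates = sufficient_tools(tools, required)
    if not candidates:
        raise LookupError("no available tool covers the required capabilities")
    tool = candidates[0]
    for attempt in range(max_retries + 1):
        try:
            return tool.name, tool.run(task)
        except TransientToolError:
            if attempt == max_retries:
                raise
```

In this sketch a higher-privilege tool is used only when no lower-privilege tool covers the task, and `is_over_privileged` corresponds to the kind of check a benchmark could apply to an agent's chosen tool. Re-raising after the retries are exhausted, rather than escalating, is a design assumption made here and is not taken from the paper.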