Progent: Programmable Privilege Control for LLM Agents

Abstract

LLM agents are an emerging form of AI system in which large language models (LLMs) serve as the central component, using a diverse set of tools to complete user-assigned tasks. Despite their great potential, LLM agents pose significant security risks. When interacting with the external world, they may encounter malicious commands from attackers, leading to the execution of dangerous actions. A promising way to address this is to enforce the principle of least privilege: allow only the actions essential for task completion and block unnecessary ones. Achieving this is challenging, however, because it requires covering diverse agent scenarios while preserving both security and utility. We introduce Progent, the first privilege control mechanism for LLM agents. At its core is a domain-specific language for flexibly expressing privilege control policies applied during agent execution. These policies provide fine-grained constraints over tool calls, deciding when tool calls are permissible and specifying fallbacks if they are not. This enables agent developers and users to craft suitable policies for their specific use cases and enforce them deterministically to guarantee security. Thanks to its modular design, integrating Progent does not alter agent internals and requires only minimal changes to the agent implementation, enhancing its practicality and potential for widespread adoption. To automate policy writing, we leverage LLMs to generate policies based on user queries; these policies are then updated dynamically for improved security and utility. Our extensive evaluation shows that Progent enables strong security while preserving high utility across three distinct scenarios or benchmarks: AgentDojo, ASB, and AgentPoison. Furthermore, we perform an in-depth analysis, showcasing the effectiveness of Progent's core components and the resilience of its automated policy generation against adaptive attacks.
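To make the idea of fine-grained, deterministically enforced tool-call policies concrete, the following is a minimal Python sketch. It is not Progent's domain-specific language or API, neither of which is shown in this abstract. The names `Policy`, `check_call`, and `guarded_call`, the example tools, and the conflict rules (deny rules take precedence, and calls with no matching allow rule are blocked) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

# Hypothetical illustration only: NOT Progent's actual DSL or API.


@dataclass
class Policy:
    tool: str                                    # tool name the rule applies to
    allow: bool                                  # True = allow rule, False = deny rule
    condition: Callable[[Dict[str, Any]], bool]  # predicate over the call's arguments
    fallback: str = "Tool call blocked by policy."  # returned when a deny rule fires


def check_call(policies: List[Policy], tool: str, args: Dict[str, Any]) -> Optional[str]:
    """Return None if the call is permitted, otherwise a fallback message.

    Assumed semantics: deny rules win over allow rules, and a call with no
    matching allow rule is blocked by default (least privilege).
    """
    matching = [p for p in policies if p.tool == tool and p.condition(args)]
    for p in matching:
        if not p.allow:
            return p.fallback
    if any(p.allow for p in matching):
        return None
    return f"Tool '{tool}' is not permitted for this task."


def guarded_call(policies: List[Policy],
                 tools: Dict[str, Callable[..., Any]],
                 tool: str,
                 args: Dict[str, Any]) -> Any:
    """Run the tool only if the policies permit it; otherwise return the fallback."""
    blocked = check_call(policies, tool, args)
    if blocked is not None:
        return blocked
    return tools[tool](**args)


# Example policies for a task such as "pay alice at most 100".
policies = [
    Policy("send_money", True,
           lambda a: a.get("recipient") == "alice" and a.get("amount", 0) <= 100),
    Policy("send_money", False,
           lambda a: a.get("amount", 0) > 1000,
           fallback="Transfers over 1000 require user confirmation."),
    Policy("read_inbox", True, lambda a: True),
]

# Stand-in tools; a real agent would call external services here.
tools = {
    "send_money": lambda recipient, amount: f"sent {amount} to {recipient}",
    "read_inbox": lambda: "3 new messages",
}
```

Because the check is ordinary code that runs outside the LLM, the decision for a given tool call and policy set is the same on every run. An injected instruction to pay a different recipient is blocked no matter how the model was persuaded to issue the call, and the fallback message is returned to the agent in place of the tool result.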
