LedgerAgent: Structured State for Policy-Adherent Tool-Calling Agents

Abstract

Policy-adherent tool-calling agents in customer-service domains must maintain task state across turns while calling tools and obeying domain policies. Task state consists of the relevant facts, identifiers, constraints, and conditions observed through user interaction and tool calls. Standard agents do not represent task state separately: observations, tool returns, and policy instructions are all placed in the prompt, and the agent must reconstruct the relevant state from the prompt each time it decides what to do next. This design makes state management implicit and creates two common failure modes. First, an agent may retrieve the right facts but later ground its decision in stale, missing, or incorrect information. Second, a syntactically valid tool call may still violate a domain policy that depends on the current task state. We introduce LedgerAgent, an inference-time method for tool-calling agents that maintains observed task state in a separate ledger and renders that state into the prompt. The ledger is also used to check state-dependent policy constraints before environment-changing tool calls are executed, blocking policy violations. Across four customer-service domains and a mixed panel of open- and closed-weight models, LedgerAgent improves average pass^k over a standard prompt-based tool-calling approach, with the largest gains under stricter multi-trial consistency metrics.
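
The mechanism described above (keep observed state in a ledger, render it into the prompt, and check state-dependent policy before any environment-changing call) can be illustrated with a minimal sketch. This is not the authors' implementation. The names `Ledger`, `guarded_call`, `POLICIES`, the `issue_refund` tool, and the "refund only for delivered orders" rule are all hypothetical and were chosen only to make the idea concrete.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Ledger:
    """Explicit store of task state observed so far (hypothetical structure)."""
    facts: Dict[str, Any] = field(default_factory=dict)

    def record(self, key: str, value: Any) -> None:
        # A newer observation overwrites the older one, so stale values do not linger.
        self.facts[key] = value

    def render(self) -> str:
        # Deterministic text block that would be placed in the prompt each turn.
        lines = [f"- {k}: {v}" for k, v in sorted(self.facts.items())]
        return "TASK STATE\n" + "\n".join(lines)


# A policy check returns None when the call is allowed, otherwise a reason string.
PolicyCheck = Callable[[Ledger, Dict[str, Any]], Optional[str]]


def refund_requires_delivered_order(ledger: Ledger, args: Dict[str, Any]) -> Optional[str]:
    # Hypothetical domain rule: refunds are only allowed for delivered orders.
    status = ledger.facts.get(f"order:{args['order_id']}:status")
    if status is None:
        return "order status has not been observed yet"
    if status != "delivered":
        return f"order status is {status!r}, refund requires 'delivered'"
    return None


# Only environment-changing tools are listed here (assumed tool name).
POLICIES: Dict[str, List[PolicyCheck]] = {
    "issue_refund": [refund_requires_delivered_order],
}


def guarded_call(ledger: Ledger, tool_name: str, args: Dict[str, Any],
                 tools: Dict[str, Callable[..., Any]]) -> Dict[str, Any]:
    """Run the policy checks against the ledger before executing the tool."""
    for check in POLICIES.get(tool_name, []):
        reason = check(ledger, args)
        if reason is not None:
            # The tool is never invoked; the reason can be fed back to the model.
            return {"blocked": True, "reason": reason}
    return {"blocked": False, "result": tools[tool_name](**args)}
```

In this sketch, a read-only tool result would be written to the ledger with `record`, the output of `render` would be included in the next prompt, and a call to `issue_refund` would be blocked until the ledger shows the order as delivered. How the actual method extracts state and encodes policies is not specified in the abstract.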