LACUNA: Recursive Program Holes as Safe Agents

Abstract

LLM agents increasingly act by writing code, yet a split persists between the runtime that drives the agent and the code the model writes. The runtime owns the loop, the context, and the control flow, and the model has little say over any of them. Letting model-written code shape the runtime itself would make agents more expressive, but it would also sharpen safety problems. A model can be diverted by a prompt injection, call the wrong tool, or fail partway and leave an inconsistent state, and each such failure reaches further when the code shapes the runtime than when it expresses a single action. We present LACUNA, a programming model for agents that closes this split while preserving safety. Each agent action is a typed call agent[T](task) that the LLM fills with code when execution reaches it, and the code is type-checked against the surrounding program before it runs. Because each action is accepted or rejected as a whole, a rejected action leaves the environment untouched, and its compiler diagnostics drive a retry. The same check also bounds which tools and data an action may use and how they flow. The primitive expresses ReAct loops, sub-agents, skills, parallel decomposition, and multi-model planning as ordinary control flow. We evaluate LACUNA on a collection of test cases, BrowseComp-Plus, and τ^2-bench. On BrowseComp-Plus, 8.6% of generations are rejected before execution, with 0.7 retries per query on average, and the agent reaches 27.1% accuracy. On τ^2-bench, LACUNA solves 76.0% of 392 tasks across four domains with a capable model, on par with the baseline agent.
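The accept-or-reject step described above can be illustrated with a small sketch. It is not LACUNA's implementation: the abstract does not specify the host language or the checker, so this example uses Python and substitutes a simple name-scope check over the syntax tree for a real type checker. It also omits the result type T. The names `agent`, `check`, `Rejected`, `fake_model`, `search`, and `delete_all` are all hypothetical, and the restricted `exec` environment is not a security sandbox.

```python
import ast
from typing import Callable, Dict, List


class Rejected(Exception):
    """Raised when every candidate was rejected by the static check."""


def check(source: str, tools: Dict[str, Callable]) -> List[str]:
    """Stand-in for a type checker: return diagnostics, empty if accepted.

    Assumption: a candidate may only read names that are granted tools
    or that it assigns itself, and it may not import anything.
    """
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]
    names = [n for n in ast.walk(tree) if isinstance(n, ast.Name)]
    assigned = {n.id for n in names if isinstance(n.ctx, ast.Store)}
    allowed = set(tools) | assigned
    diagnostics = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            diagnostics.append("imports are not permitted")
        elif (isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load)
              and node.id not in allowed):
            diagnostics.append(f"name not in scope: {node.id}")
    return diagnostics


def agent(task: str,
          generate: Callable[[str, List[str]], str],
          tools: Dict[str, Callable],
          max_retries: int = 3):
    """Fill the hole for `task`; run a candidate only after it passes check."""
    diagnostics: List[str] = []
    for _ in range(max_retries + 1):
        source = generate(task, diagnostics)   # model writes code
        diagnostics = check(source, tools)     # checked before anything runs
        if not diagnostics:
            env = {"__builtins__": {}, **tools}
            exec(source, env)                  # accepted as a whole
            return env.get("result")
        # Rejected: nothing was executed, so no tool was called.
    raise Rejected(diagnostics)


# Hypothetical tools and a scripted "model" for demonstration.
calls = []

def search(query: str) -> str:
    calls.append(("search", query))
    return f"hits for {query}"

def delete_all() -> None:
    calls.append(("delete_all",))

def fake_model(task: str, diagnostics: List[str]) -> str:
    # First attempt reaches for a tool that was not granted;
    # after seeing diagnostics it falls back to the granted one.
    if not diagnostics:
        return "result = delete_all()"
    return "result = search('lacuna')"

answer = agent("find lacuna", fake_model, {"search": search})
print(answer, calls)
```

In this sketch the first candidate is rejected without running, so `delete_all` is never invoked, and the diagnostics from that rejection are what the second generation sees. A static type checker, as the abstract describes, would additionally constrain the result type and the flow of data between tools, which this name-based check does not attempt.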