LACUNA: Safe Agents as Recursive Program Holes

Abstract

LLM agents increasingly act by writing code, yet a split persists between the runtime that drives the agent and the code the model writes. The runtime owns the loop, context, and control flow, and the model has little say over any of them. Letting model-written code shape the runtime itself would make agents more expressive, but it would also sharpen safety problems. A model can be diverted by a prompt injection, call the wrong tool, or fail partway and leave an inconsistent state, and each such failure reaches further when the code shapes the runtime than when it expresses a single action. We present LACUNA, a programming model for agents that closes this split while preserving safety. Each agent action is a typed call agent[T](task) that the LLM fills with code when execution reaches it, and the code is type-checked against the surrounding program before it runs. Because each action is accepted or rejected as a whole, a rejected one leaves the environment untouched, and its compiler diagnostics drive a retry. The same check also bounds which tools and data an action may use and how they flow. Our primitive expresses ReAct loops, sub-agents, skills, parallel decomposition, and multi-model planning as ordinary control flow. We evaluate LACUNA on a collection of test cases, BrowseComp-Plus, and τ^2-bench. On BrowseComp-Plus, 8.6% of generations are rejected before execution, with 0.7 retries per query on average, and the agent reaches 27.1% accuracy. On τ^2-bench, LACUNA solves 76.0% of 392 tasks across four domains with a capable model, on par with the baseline agent.
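The check-before-run loop described in the abstract can be illustrated with a small Python sketch. This is not LACUNA's implementation. It assumes a hypothetical `fill_hole` function standing in for agent[T](task), a hypothetical `stub_model` standing in for the LLM, and a simple name-scope check standing in for the type checker; a real system would check types against the surrounding program. The point is only the shape of the protocol: a candidate is checked as a whole, a rejected candidate never executes, and its diagnostics are fed into the next attempt.

```python
import ast
from typing import Any, Callable

# Hypothetical model interface: (task, diagnostics from the last attempt) -> source code.
Model = Callable[[str, list[str]], str]


def check(source: str, allowed: set[str]) -> list[str]:
    """Return diagnostics for a candidate; an empty list means it is accepted.

    Assumption: a scope check on names stands in for full type checking.
    It is not a sandbox and does not inspect attribute access.
    """
    try:
        tree = ast.parse(source, mode="eval")
    except SyntaxError as err:
        return [f"syntax error: {err.msg}"]
    return [
        f"name '{node.id}' is not in scope"
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and node.id not in allowed
    ]


def fill_hole(task: str, tools: dict[str, Any], model: Model, max_retries: int = 3) -> Any:
    """Ask the model for code, check it, and run it only once it is accepted."""
    diagnostics: list[str] = []
    for _ in range(max_retries + 1):
        source = model(task, diagnostics)
        diagnostics = check(source, set(tools))
        if not diagnostics:
            # Accepted as a whole: only now does any tool get called.
            return eval(source, {"__builtins__": {}, **tools})
        # Rejected as a whole: nothing ran, so the environment is untouched.
    raise RuntimeError(f"no candidate accepted: {diagnostics}")


calls: list[str] = []  # records every tool invocation, to show what actually ran


def search(query: str) -> list[str]:
    calls.append(query)
    return ["doc-1", "doc-2"]


def stub_model(task: str, diagnostics: list[str]) -> str:
    # First attempt reaches for a tool that is not in scope; the retry stays in bounds.
    if not diagnostics:
        return "delete_all(search('lacuna'))"
    return "len(search('lacuna'))"


result = fill_hole("count documents about lacuna", {"search": search, "len": len}, stub_model)
```

In this sketch the first candidate is rejected because `delete_all` is not among the provided tools, so `search` is not called during that attempt. The second candidate passes the check and runs. Restricting the names visible to the candidate is a loose analogue of bounding which tools an action may use.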