LACUNA: Safe Agents as Recursive Program Holes

Abstract

LLM agents increasingly act by writing code, yet a split persists between the runtime that drives the agent and the code the model writes. The runtime owns the loop, the context, and the control flow, and the model has little say over any of them. Letting model-written code shape the runtime itself would make agents more expressive, but it would also sharpen safety problems. A model can be diverted by a prompt injection, call the wrong tool, or fail partway and leave an inconsistent state, and each such failure reaches further when the code shapes the runtime than when it expresses a single action. We present LACUNA, a programming model for agents that closes this split while preserving safety. Each agent action is a typed call `agent[T](task)` that the LLM fills with code when execution reaches it, and that code is type-checked against the surrounding program before it runs. Because each action is accepted or rejected as a whole, a rejected action leaves the environment untouched, and its compiler diagnostics drive a retry. The same check also bounds which tools and data an action may use and how they flow. This single primitive expresses ReAct loops, sub-agents, skills, parallel decomposition, and multi-model planning as ordinary control flow. We evaluate LACUNA on a collection of test cases, BrowseComp-Plus, and τ²-bench. On BrowseComp-Plus, 8.6% of generations are rejected before execution, with 0.7 retries per query on average, and the agent reaches 27.1% accuracy. On τ²-bench, LACUNA solves 76.0% of 392 tasks across four domains with a capable model, on par with the baseline agent.
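The accept-or-reject cycle described in the abstract can be illustrated with a minimal Python sketch. It is not LACUNA's implementation: the names `agent`, `check`, `Rejected`, and `stub_model` are hypothetical. The "type check" is approximated by a syntax check plus an allowlist of names in scope, followed by a runtime check of the result type. Running the candidate on a copy of the environment and committing only on success is an assumption of this sketch, and restricting `__builtins__` is not a real sandbox.

```python
import ast
import copy


class Rejected(Exception):
    """Raised when no candidate is accepted within the retry budget."""


def check(source: str, allowed: set[str]) -> list[str]:
    """Static stand-in for a type check: return diagnostics, empty if accepted."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error: {err.msg} (line {err.lineno})"]
    # Names the candidate binds itself are in scope for its own reads.
    assigned = {
        n.id for n in ast.walk(tree)
        if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)
    }
    diagnostics = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            diagnostics.append("imports are not permitted")
        elif isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id not in allowed and node.id not in assigned:
                diagnostics.append(f"name '{node.id}' is not in scope")
    if "result" not in assigned:
        diagnostics.append("candidate must assign 'result'")
    return diagnostics


def agent(expected, task, generate, tools, env, max_retries=2):
    """Fill one typed hole: generate code, check it, run it, commit or retry."""
    diagnostics: list[str] = []
    for _ in range(max_retries + 1):
        source = generate(task, diagnostics)  # diagnostics steer the retry
        diagnostics = check(source, set(tools) | {"env"})
        if diagnostics:
            continue  # rejected before execution, so env was never touched
        scratch = copy.deepcopy(env)  # assumption: run on a copy, commit later
        scope = {"__builtins__": {}, "env": scratch, **tools}
        try:
            exec(source, scope)
        except Exception as err:
            diagnostics = [f"runtime error: {err!r}"]
            continue
        result = scope.get("result")
        if not isinstance(result, expected):  # dynamic stand-in for T
            diagnostics = [
                f"expected {expected.__name__}, got {type(result).__name__}"
            ]
            continue
        env.clear()
        env.update(scratch)  # the action is accepted as a whole
        return result
    raise Rejected(diagnostics)


def stub_model(task, diagnostics):
    """Hypothetical model: the first attempt calls a tool that is out of scope."""
    if not diagnostics:
        return "result = delete_all(env)"
    return "env['count'] = env['count'] + 1\nresult = add(env['count'], 1)"


env = {"count": 0}
value = agent(int, "increment the counter", stub_model,
              {"add": lambda a, b: a + b}, env)
```

In this run the first candidate is rejected by `check` because `delete_all` is not among the permitted tools, the diagnostic is passed back to `stub_model`, and the second candidate is accepted and committed. A real system would replace `check` with the host language's compiler so that the result type is also verified before execution.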