Agent libOS: A Library-OS-Inspired Runtime for Long-Running, Capability-Controlled LLM Agents

Abstract

Large language model (LLM) agents are evolving from request-response assistants into long-running software actors: they maintain state across model calls, fork subtasks, wait for external events, request human authorization, generate tools, and perform side effects that must be resumable and auditable. This paper presents Agent libOS, a library-OS-inspired runtime substrate for LLM agents. Agent libOS runs above a conventional host operating system; it does not implement hardware drivers, kernel-mode isolation, or a POSIX-compatible operating system. Instead, it treats an agent as an AgentProcess: a schedulable execution subject with process identity, parent-child lineage, lifecycle state, a tool table derived from an AgentImage, typed Object Memory, explicit capabilities, human queues, checkpoints, events, and audit records. Its central design rule is: tools are libc-like wrappers; runtime primitives are the authority boundary. Filesystem access, object access, sleeps, human approval, JIT tool registration, and external side effects are checked at primitive boundaries under explicit capabilities and policy. We describe the design, threat model, Python prototype, and safety-oriented evaluation. The current prototype implements async scheduling, namespace-local Object Memory, runtime-integrated human approval, one-shot permission grants, per-process working directories, shell and image-registration primitives, Deno/TypeScript JIT tools over a libOS syscall broker, filesystem/object bridge tools, an injectable Resource Provider Substrate, deterministic demos, real-model smoke scripts, and 123 regression tests at the time of writing. Agent libOS does not aim to improve planner accuracy; rather, it demonstrates a runtime substrate in which long-running LLM agents can be scheduled, authorized, resumed, and audited without treating tool dispatch as the trust boundary.
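To make the central design rule concrete, the following is a minimal sketch of a tool acting as a thin wrapper over a capability-checked runtime primitive. It is not the prototype's actual API: the names `Runtime`, `Capability`, `CapabilityError`, `sys_fs_read`, and `read_file_tool`, the in-memory file table, and the prefix-matching policy are all assumptions made for illustration. Only the ideas of an AgentProcess, explicit capabilities, one-shot grants, and audit records come from the abstract.

```python
from dataclasses import dataclass, field
from typing import Optional


class CapabilityError(PermissionError):
    """Raised when a primitive is invoked without a matching capability."""


@dataclass(frozen=True)
class Capability:
    # Hypothetical shape: an action name plus a resource prefix it covers.
    action: str
    resource: str
    one_shot: bool = False  # consumed after a single successful use


@dataclass
class AgentProcess:
    # Illustrative subset of the fields the abstract lists for an AgentProcess.
    pid: int
    parent: Optional[int] = None
    capabilities: list = field(default_factory=list)
    audit: list = field(default_factory=list)


class Runtime:
    """Hypothetical runtime: primitives, not tools, enforce authority."""

    def __init__(self, files):
        self._files = dict(files)  # stand-in for a real filesystem

    def _authorize(self, proc, action, resource):
        # Naive prefix match; a real policy would normalize paths first.
        for cap in proc.capabilities:
            if cap.action == action and resource.startswith(cap.resource):
                if cap.one_shot:
                    proc.capabilities.remove(cap)  # safe: we return right away
                proc.audit.append((action, resource, "allow"))
                return
        proc.audit.append((action, resource, "deny"))
        raise CapabilityError(f"pid {proc.pid}: {action} on {resource} denied")

    def sys_fs_read(self, proc, path):
        # The check happens here, at the primitive boundary.
        self._authorize(proc, "fs.read", path)
        return self._files[path]


def read_file_tool(runtime, proc, path):
    """Tool: a libc-like wrapper that holds no authority of its own."""
    return runtime.sys_fs_read(proc, path)


rt = Runtime({"/work/notes.txt": "hello"})
proc = AgentProcess(
    pid=1,
    capabilities=[Capability("fs.read", "/work/", one_shot=True)],
)

first = read_file_tool(rt, proc, "/work/notes.txt")  # consumes the grant
try:
    read_file_tool(rt, proc, "/work/notes.txt")
    second = "allowed"
except CapabilityError:
    second = "denied"
```

In this sketch the tool function performs no checks at all; replacing or regenerating it cannot widen what the process may do, because the one-shot grant is consumed and the denial is recorded inside the primitive.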