AgentSPEX: An Agent SPecification and EXecution Language

Abstract

Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps. This leaves control flow and intermediate state implicit, which can make agent behavior difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose more structure through explicit workflow definitions, but they tightly couple workflow logic to Python code, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management. These workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. We also provide a visual editor with synchronized graph and workflow views for authoring and inspection, and we include ready-to-use agents for deep research and scientific research. We evaluate AgentSPEX on 7 benchmarks and show through a user study that it provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
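
To make the contrast with reactive prompting concrete, the following is a minimal sketch of a workflow held as plain data, with explicit loops, branches, and state, and run by a small interpreter. It is not AgentSPEX syntax, which the abstract does not show. The step types (`call`, `branch`, `loop`), the handler names, and the `execute` function are all hypothetical, and the handlers are deterministic stubs standing in for LLM or tool calls.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]

# Hypothetical handlers; in a real agent these would call a model or a tool.
def draft(state: State) -> None:
    state["draft"] = f"notes on {state['topic']} (rev {state['revision']})"

def review(state: State) -> None:
    # Stub verifier: accept the draft once it has been revised twice.
    state["approved"] = state["revision"] >= 2

def revise(state: State) -> None:
    state["revision"] += 1

HANDLERS: Dict[str, Callable[[State], None]] = {
    "draft": draft,
    "review": review,
    "revise": revise,
}

# The workflow is plain data, kept apart from the code that executes it.
# This schema is an illustrative assumption, not the AgentSPEX format.
WORKFLOW: List[Dict[str, Any]] = [
    {"type": "loop", "until": "approved", "max_iters": 5, "body": [
        {"type": "call", "run": "draft"},
        {"type": "call", "run": "review"},
        {"type": "branch", "if_not": "approved",
         "then": [{"type": "call", "run": "revise"}]},
    ]},
]

def execute(steps: List[Dict[str, Any]], state: State, log: List[str]) -> State:
    """Run steps in order; control flow and state are explicit and logged."""
    for step in steps:
        kind = step["type"]
        if kind == "call":
            HANDLERS[step["run"]](state)
            log.append(step["run"])
        elif kind == "branch":
            # Take the branch only when the named state flag is falsy.
            if not state.get(step["if_not"]):
                execute(step["then"], state, log)
        elif kind == "loop":
            # Bounded loop: stop early once the exit flag is set.
            for _ in range(step["max_iters"]):
                execute(step["body"], state, log)
                if state.get(step["until"]):
                    break
        else:
            raise ValueError(f"unknown step type: {kind}")
    return state

def run_demo():
    state: State = {"topic": "agents", "revision": 0, "approved": False}
    log: List[str] = []
    execute(WORKFLOW, state, log)
    return state, log
```

Because the workflow is data, the loop bound, the branch condition, and the step order can be inspected or edited without touching the interpreter, and the log records which steps actually ran.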
