AgentSPEX: An Agent SPecification and EXecution Language

Abstract

Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps. This leaves control flow and intermediate state implicit and can make agent behavior difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose greater structure through explicit workflow definitions, but they tightly couple workflow logic with Python, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management. These workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. Furthermore, we provide a visual editor with synchronized graph and workflow views for authoring and inspection. We include ready-to-use agents for deep research and scientific research, and we evaluate AgentSPEX on 7 benchmarks. Finally, we show through a user study that AgentSPEX provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
