AgentSPEX: An Agent SPecification and EXecution Language

Abstract

Language-model agent systems commonly rely on reactive prompting, in which a single instruction guides the model through an open-ended sequence of reasoning and tool-use steps. This leaves control flow and intermediate state implicit and can make agent behavior difficult to control. Orchestration frameworks such as LangGraph, DSPy, and CrewAI impose greater structure through explicit workflow definitions, but tightly couple workflow logic with Python, making agents difficult to maintain and modify. In this paper, we introduce AgentSPEX, an Agent SPecification and EXecution Language for specifying LLM-agent workflows with explicit control flow and modular structure, along with a customizable agent harness. AgentSPEX supports typed steps, branching and loops, parallel execution, reusable submodules, and explicit state management. These workflows execute within an agent harness that provides tool access, a sandboxed virtual environment, and support for checkpointing, verification, and logging. Furthermore, we provide a visual editor with synchronized graph and workflow views for authoring and inspection. We include ready-to-use agents for deep research and scientific research, and we evaluate AgentSPEX on 7 benchmarks. Finally, we show through a user study that AgentSPEX provides a more interpretable and accessible workflow-authoring paradigm than a popular existing agent framework.
