AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
June 6, 2024
Authors: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
cs.AI
Abstract
Building generalist agents that can handle diverse tasks and evolve
themselves across different environments is a long-term goal in the AI
community. Large language models (LLMs) are considered a promising foundation
to build such agents due to their generalized capabilities. Current approaches
either have LLM-based agents imitate expert-provided trajectories step-by-step,
requiring human supervision, which is hard to scale and limits environmental
exploration; or they let agents explore and learn in isolated environments,
resulting in specialist agents with limited generalization. In this paper, we
take the first step towards building generally-capable LLM-based agents with
self-evolution ability. We identify a trinity of ingredients: 1) diverse
environments for agent exploration and learning, 2) a trajectory set to equip
agents with basic capabilities and prior knowledge, and 3) an effective and
scalable evolution method. We propose AgentGym, a new framework featuring a
variety of environments and tasks for broad, real-time, uni-format, and
concurrent agent exploration. AgentGym also includes a database with expanded
instructions, a benchmark suite, and high-quality trajectories across
environments. Next, we propose a novel method, AgentEvol, to investigate the
potential of agent self-evolution beyond previously seen data across tasks and
environments. Experimental results show that the evolved agents can achieve
results comparable to SOTA models. We release the AgentGym suite, including the
platform, dataset, benchmark, checkpoints, and algorithm implementations. The
AgentGym suite is available on https://github.com/WooooDyy/AgentGym.