AgentGym: Evolving Large Language Model-based Agents across Diverse Environments
June 6, 2024
Authors: Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Dingwen Yang, Chenyang Liao, Xin Guo, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang
cs.AI
Abstract
Building generalist agents that can handle diverse tasks and evolve
themselves across different environments is a long-term goal in the AI
community. Large language models (LLMs) are considered a promising foundation
to build such agents due to their generalized capabilities. Current approaches
either have LLM-based agents imitate expert-provided trajectories step-by-step,
requiring human supervision, which is hard to scale and limits environmental
exploration; or they let agents explore and learn in isolated environments,
resulting in specialist agents with limited generalization. In this paper, we
take the first step towards building generally-capable LLM-based agents with
self-evolution ability. We identify a trinity of ingredients: 1) diverse
environments for agent exploration and learning, 2) a trajectory set to equip
agents with basic capabilities and prior knowledge, and 3) an effective and
scalable evolution method. We propose AgentGym, a new framework featuring a
variety of environments and tasks for broad, real-time, uni-format, and
concurrent agent exploration. AgentGym also includes a database with expanded
instructions, a benchmark suite, and high-quality trajectories across
environments. Next, we propose a novel method, AgentEvol, to investigate the
potential of agent self-evolution beyond previously seen data across tasks and
environments. Experimental results show that the evolved agents can achieve
results comparable to SOTA models. We release the AgentGym suite, including the
platform, dataset, benchmark, checkpoints, and algorithm implementations. The
AgentGym suite is available at https://github.com/WooooDyy/AgentGym.
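The abstract describes an explore-then-learn recipe: agents explore diverse environments, collect trajectories, and a scalable evolution method lets them improve beyond the initially seen data. The following is a minimal toy sketch of that loop; every name here (`Env`, `Agent`, `evolve`) is a hypothetical stand-in, not the actual AgentGym/AgentEvol API, and the "learning" step is a crude imitation of high-reward behavior rather than LLM fine-tuning.

```python
import random

class Env:
    """Toy environment: reward 1.0 when the agent's action matches a target."""
    def __init__(self, target):
        self.target = target

    def step(self, action):
        return 1.0 if action == self.target else 0.0

class Agent:
    """Toy agent: samples actions from a weight table (stand-in for an LLM policy)."""
    def __init__(self, actions):
        self.weights = {a: 1.0 for a in actions}

    def act(self):
        acts = list(self.weights)
        return random.choices(acts, weights=[self.weights[a] for a in acts])[0]

    def learn(self, trajectories):
        # "Imitation" step: upweight actions that appeared in rewarded trajectories.
        for action, reward in trajectories:
            if reward > 0:
                self.weights[action] += 1.0

def evolve(agent, envs, iterations=5, rollouts=50):
    """Alternate exploration (collect trajectories across environments)
    and learning (imitate the successful ones)."""
    for _ in range(iterations):
        trajectories = []
        for env in envs:
            for _ in range(rollouts):
                action = agent.act()
                trajectories.append((action, env.step(action)))
        agent.learn(trajectories)
    return agent

random.seed(0)
envs = [Env("open"), Env("open")]
agent = evolve(Agent(["open", "close", "wait"]), envs)
best = max(agent.weights, key=agent.weights.get)
print(best)
```

In the real framework the environments are served over a unified interface for concurrent exploration, and the learning step updates an LLM's parameters; this sketch only illustrates the alternating structure of exploration and imitation.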