AgentGym: Evolving Large Language Model-based Agents across Diverse Environments

Abstract

Building generalist agents that can handle diverse tasks and evolve themselves across different environments is a long-term goal of the AI community. Large language models (LLMs) are considered a promising foundation for such agents because of their general capabilities. Current approaches fall into two groups. The first has LLM-based agents imitate expert-provided trajectories step by step, which requires human supervision, is hard to scale, and limits environmental exploration. The second lets agents explore and learn in isolated environments, which yields specialist agents with limited generalization. In this paper, we take the first step towards building generally capable LLM-based agents with self-evolution ability. We identify a trinity of ingredients: 1) diverse environments for agent exploration and learning, 2) a trajectory set to equip agents with basic capabilities and prior knowledge, and 3) an effective and scalable evolution method. We propose AgentGym, a new framework featuring a variety of environments and tasks for broad, real-time, uni-format, and concurrent agent exploration. AgentGym also includes a database with expanded instructions, a benchmark suite, and high-quality trajectories across environments. Next, we propose a novel method, AgentEvol, to investigate the potential of agent self-evolution beyond previously seen data across tasks and environments. Experimental results show that the evolved agents can achieve results comparable to SOTA models. We release the AgentGym suite, including the platform, dataset, benchmark, checkpoints, and algorithm implementations. The AgentGym suite is available at https://github.com/WooooDyy/AgentGym.
