AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning

Abstract

Autonomous agents powered by large language models (LLMs) have garnered significant research attention. However, fully harnessing the potential of LLMs for agent-based tasks is inherently challenging, because the multi-turn trajectories involved come from diverse data sources with heterogeneous formats. In this paper, we introduce AgentOhana as a comprehensive solution to these challenges. AgentOhana aggregates agent trajectories from distinct environments, spanning a wide array of scenarios. It standardizes and unifies these trajectories into a consistent format, streamlining the creation of a generic data loader optimized for agent training. Building on this data unification, our training pipeline maintains equilibrium across different data sources and preserves independent randomness across devices during dataset partitioning and model training. Additionally, we present xLAM-v0.1, a large action model tailored for AI agents, which demonstrates exceptional performance across various benchmarks.
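The abstract names three mechanisms without giving details: a unified trajectory format, balance across data sources, and independent randomness per device. The following is a minimal sketch of how these ideas could fit together, not AgentOhana's actual schema or API. The `Turn` and `Trajectory` classes, the `standardize` and `balanced_stream` functions, the source weights, and the seed rule are all hypothetical names and choices made for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Turn:
    role: str      # e.g. "user" or "assistant" (hypothetical role names)
    content: str


@dataclass
class Trajectory:
    source: str                                  # originating dataset or environment
    turns: list[Turn] = field(default_factory=list)


def standardize(source: str, raw_steps: list[tuple[str, str]]) -> Trajectory:
    """Convert (role, text) pairs from any source into the shared schema."""
    return Trajectory(source, [Turn(role, text) for role, text in raw_steps])


def balanced_stream(by_source: dict[str, list[Trajectory]],
                    weights: dict[str, float],
                    base_seed: int,
                    device_rank: int,
                    n: int) -> list[Trajectory]:
    """Draw n trajectories, choosing the source by weight before the sample.

    Picking the source first keeps a large dataset from drowning out a
    small one. Each device derives its own seed from (base_seed,
    device_rank), so devices draw different but reproducible sequences.
    """
    rng = random.Random(base_seed * 1000 + device_rank)  # illustrative seed rule
    names = sorted(by_source)
    source_weights = [weights[name] for name in names]
    drawn = []
    for _ in range(n):
        name = rng.choices(names, weights=source_weights, k=1)[0]
        drawn.append(rng.choice(by_source[name]))
    return drawn
```

With equal weights, a source holding five trajectories is drawn about as often as one holding five hundred, and calling `balanced_stream` with `device_rank=0` and `device_rank=1` yields two distinct sample orders from the same `base_seed`.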
