APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Abstract

Training effective AI agents for multi-turn interactions requires high-quality data that captures realistic human-agent dynamics, yet such data is scarce and expensive to collect manually. We introduce APIGen-MT, a two-phase framework that generates verifiable and diverse multi-turn agent data. In the first phase, our agentic pipeline produces detailed task blueprints with ground-truth actions, leveraging a committee of LLM reviewers and iterative feedback loops. In the second phase, these blueprints are transformed into complete interaction trajectories through simulated human-agent interplay. We train a family of models, the xLAM-2-fc-r series, with sizes ranging from 1B to 70B parameters. Our models outperform frontier models such as GPT-4o and Claude 3.5 on the tau-bench and BFCL benchmarks, with the smaller models surpassing their larger counterparts, particularly in multi-turn settings, while maintaining superior consistency across multiple trials. Comprehensive experiments demonstrate that our verified blueprint-to-details approach yields high-quality training data, enabling the development of more reliable, efficient, and capable agents. We open-source both the collected synthetic data and the trained xLAM-2-fc-r models to advance research in AI agents. Models are available on HuggingFace at https://huggingface.co/collections/Salesforce/xlam-2-67ef5be12949d8dcdae354c4 and the project website is https://apigen-mt.github.io
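The two-phase flow described in the abstract can be illustrated with a minimal sketch. Everything here is hypothetical: the names `Blueprint`, `committee_review`, `generate_blueprint`, and `simulate_trajectory`, the majority-vote rule, and the stub reviewers and simulators are illustrative assumptions, not the authors' implementation. In particular, the final check that a trajectory's executed actions match the blueprint's ground-truth actions is an assumption about how "verifiable" might be enforced; the abstract does not specify it. Plain Python functions stand in for the LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Blueprint:
    """Hypothetical task blueprint: a user instruction plus ground-truth actions."""
    instruction: str
    actions: List[str]


# A reviewer returns (approved, feedback). In a real system this would be an LLM call.
Reviewer = Callable[[Blueprint], Tuple[bool, str]]


def committee_review(bp: Blueprint, reviewers: List[Reviewer]) -> Tuple[bool, List[str]]:
    """Assumed rule: accept on a strict majority; collect feedback from rejections."""
    verdicts = [review(bp) for review in reviewers]
    approvals = sum(1 for ok, _ in verdicts if ok)
    feedback = [msg for ok, msg in verdicts if not ok]
    return approvals * 2 > len(reviewers), feedback


def generate_blueprint(propose, reviewers: List[Reviewer], max_rounds: int = 3) -> Optional[Blueprint]:
    """Phase 1: propose, review, and revise with feedback until accepted or out of rounds."""
    feedback: List[str] = []
    for _ in range(max_rounds):
        bp = propose(feedback)
        approved, feedback = committee_review(bp, reviewers)
        if approved:
            return bp
    return None


def simulate_trajectory(bp: Blueprint, user_turn, agent_turn, max_turns: int = 10):
    """Phase 2: alternate simulated user and agent turns.

    The trajectory is kept only if the executed actions equal the ground truth
    (an assumed acceptance check).
    """
    history: List[Tuple[str, str]] = []
    executed: List[str] = []
    for _ in range(max_turns):
        message = user_turn(bp, history)
        if message is None:  # the simulated user has nothing more to ask
            break
        history.append(("user", message))
        reply, actions = agent_turn(history)
        history.append(("agent", reply))
        executed.extend(actions)
    return history if executed == bp.actions else None


# --- Stub components standing in for LLM calls (illustrative only) ---
def propose(feedback: List[str]) -> Blueprint:
    # The first draft omits actions; the revision made after feedback includes one.
    if feedback:
        return Blueprint("Cancel order 42", ["cancel_order(42)"])
    return Blueprint("Cancel order 42", [])


def has_actions(bp: Blueprint) -> Tuple[bool, str]:
    return bool(bp.actions), "add ground-truth actions"


def has_instruction(bp: Blueprint) -> Tuple[bool, str]:
    return bool(bp.instruction), "add an instruction"


def user_turn(bp: Blueprint, history) -> Optional[str]:
    return "Please cancel order 42" if not history else None


def agent_turn(history) -> Tuple[str, List[str]]:
    return "Order 42 cancelled.", ["cancel_order(42)"]


if __name__ == "__main__":
    blueprint = generate_blueprint(propose, [has_actions, has_instruction])
    if blueprint is not None:
        print(simulate_trajectory(blueprint, user_turn, agent_turn))
```

In this toy run the first draft is rejected because it has no actions, the revised draft passes the committee, and the simulated dialogue is kept because the agent's executed action matches the blueprint.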
