AWorld: Orchestrating the Training Recipe for Agentic AI

Abstract

The learning-from-practice paradigm is crucial for developing capable agentic AI systems, yet it is severely hampered by inefficient experience generation, a bottleneck that is especially pronounced on complex benchmarks such as GAIA. To address this, we introduce AWorld, an open-source system engineered for large-scale agent-environment interaction. By distributing tasks across a cluster, AWorld accelerates experience collection by 14.6x compared to standard single-node, sequential execution. This speedup makes extensive reinforcement learning practical and scalable. Leveraging this capability, we trained a Qwen3-32B-based agent that significantly outperforms its base model, raising its overall GAIA accuracy from 21.59% to 32.23%. On the benchmark's most challenging levels, our agent achieves a score of 16.33%, surpassing the performance of leading proprietary models. Our open-source system and the resulting agent provide a practical blueprint for a complete agentic AI training pipeline, from efficient interaction to demonstrable model improvement.
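To make the contrast between sequential and distributed experience collection concrete, the following is a minimal sketch, not AWorld's actual API. It assumes a hypothetical `run_episode` function standing in for one agent-environment rollout, and it uses a local thread pool as a stand-in for cluster workers. The names `run_episode`, `collect_sequential`, and `collect_concurrent` are illustrative only, and the sketch makes no attempt to reproduce the 14.6x figure reported above.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_episode(task_id: int, latency_s: float = 0.01) -> dict:
    """Hypothetical stand-in for one agent-environment rollout."""
    # Simulate time spent waiting on the environment or on tool calls.
    time.sleep(latency_s)
    # Dummy reward, used only so the two collectors can be compared.
    return {"task_id": task_id, "reward": float(task_id % 2)}


def collect_sequential(task_ids):
    """Baseline: run rollouts one after another on a single worker."""
    return [run_episode(t) for t in task_ids]


def collect_concurrent(task_ids, max_workers: int = 8):
    """Run rollouts on a pool of workers; threads stand in for cluster nodes."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input tasks.
        return list(pool.map(run_episode, task_ids))


if __name__ == "__main__":
    tasks = list(range(32))

    start = time.perf_counter()
    sequential = collect_sequential(tasks)
    t_seq = time.perf_counter() - start

    start = time.perf_counter()
    concurrent = collect_concurrent(tasks)
    t_con = time.perf_counter() - start

    print(f"sequential: {t_seq:.3f}s, concurrent: {t_con:.3f}s")
    print("same experience collected:", sequential == concurrent)
```

Because each simulated rollout spends its time waiting rather than computing, running many of them at once shortens wall-clock collection time without changing the experience that is gathered. A real deployment would distribute rollouts across machines rather than threads.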
