AWorld: Orchestrating the Training Recipe for Agentic AI
August 28, 2025
Authors: Chengyue Yu, Siyuan Lu, Chenyi Zhuang, Dong Wang, Qintong Wu, Zongyue Li, Runsheng Gan, Chunfeng Wang, Siqi Hou, Gaochi Huang, Wenlong Yan, Lifeng Hong, Aohui Xue, Yanfeng Wang, Jinjie Gu, David Tsai, Tao Lin
cs.AI
Abstract
The learning-from-practice paradigm is crucial for developing capable Agentic
AI systems, yet it is severely hampered by inefficient experience generation, a
bottleneck especially pronounced in complex benchmarks like GAIA. To address
this, we introduce AWorld, an open-source system engineered for large-scale
agent-environment interaction. By distributing tasks across a cluster, AWorld
accelerates experience collection by 14.6x compared to standard single-node,
sequential execution. This critical speedup makes extensive reinforcement
learning practical and scalable. Leveraging this capability, we trained a
Qwen3-32B-based agent that significantly outperforms its base model, increasing
its overall GAIA accuracy from 21.59% to 32.23%. On the benchmark's most
challenging levels, our agent achieves a score of 16.33%, surpassing the
performance of leading proprietary models. Our open-source system and resulting
agent provide a practical blueprint for a complete agentic AI training
pipeline, from efficient interaction to demonstrable model improvement.
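
To make the idea of distributing rollouts concrete, here is a minimal sketch that contrasts sequential and concurrent experience collection. It is not AWorld's API: `run_rollout`, `Rollout`, `collect_sequential`, and `collect_parallel` are hypothetical names, the episode is simulated with a seeded random generator, and a local thread pool stands in for cluster workers. The 14.6x figure in the abstract comes from the authors' cluster setup and is not something this toy example reproduces.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
import random


@dataclass(frozen=True)
class Rollout:
    """One finished episode (hypothetical record, not an AWorld type)."""
    task_id: int
    steps: int
    reward: float


def run_rollout(task_id: int) -> Rollout:
    """Hypothetical stand-in for one agent-environment episode.

    A real episode would call a model and tools; here the outcome is
    simulated and seeded by task_id so it is reproducible.
    """
    rng = random.Random(task_id)
    steps = rng.randint(1, 10)
    reward = 1.0 if rng.random() < 0.3 else 0.0
    return Rollout(task_id, steps, reward)


def collect_sequential(task_ids):
    """Baseline: run every task one after another."""
    return [run_rollout(t) for t in task_ids]


def collect_parallel(task_ids, max_workers=8):
    """Run tasks concurrently; pool.map returns results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_rollout, task_ids))


if __name__ == "__main__":
    tasks = range(20)
    rollouts = collect_parallel(tasks)
    # Both strategies yield the same experience; only wall-clock time differs
    # when episodes are slow and independent.
    print(rollouts == collect_sequential(tasks))
```

Because episodes are independent of one another, the collected experience is the same under either strategy. Any gain comes from overlapping the waiting time of slow episodes, which is the property a cluster-scale system exploits.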