WebDancer: Towards Autonomous Information Seeking Agency
May 28, 2025
Authors: Jialong Wu, Baixuan Li, Runnan Fang, Wenbiao Yin, Liwen Zhang, Zhengwei Tao, Dingchu Zhang, Zekun Xi, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou
cs.AI
Abstract
Addressing intricate real-world problems requires in-depth information
seeking and multi-step reasoning. Recent progress in agentic systems,
exemplified by Deep Research, underscores the potential for autonomous
multi-step research. In this work, we present a cohesive paradigm for building
end-to-end information-seeking agents from a data-centric and
training-stage perspective. Our approach consists of four key stages: (1)
browsing data construction, (2) trajectory sampling, (3) supervised
fine-tuning for an effective cold start, and (4) reinforcement learning for
enhanced generalisation. We instantiate this framework in WebDancer, a web
agent based on the ReAct framework. Empirical evaluations on two challenging
information-seeking benchmarks, GAIA and WebWalkerQA, show that WebDancer
performs strongly, which supports the efficacy of our training paradigm.
Further analysis of agent training provides valuable
insights and actionable, systematic pathways for developing more capable
agentic models. The code and demo will be released at
https://github.com/Alibaba-NLP/WebAgent.
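
The abstract states that WebDancer is a web agent based on ReAct. As a rough illustration of what a ReAct-style loop looks like in general, the sketch below alternates a thought, an action, and an observation until the policy decides to answer. It is not WebDancer's implementation: `react_loop`, `toy_policy`, `toy_tools`, the `search` tool, and the step budget are all hypothetical names chosen for this example, and the policy is a hard-coded stand-in for a language model.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; none of these names come from WebDancer.
Step = Tuple[str, str, str]  # (thought, action name, action argument)
Policy = Callable[[str, List[str]], Step]
Tools = Dict[str, Callable[[str], str]]


def react_loop(question: str, policy: Policy, tools: Tools, max_steps: int = 5) -> str:
    """Run a Thought -> Action -> Observation loop until the policy answers."""
    history: List[str] = []
    for _ in range(max_steps):
        thought, action, argument = policy(question, history)
        history.append(f"Thought: {thought}")
        if action == "answer":
            return argument
        # Unknown tools yield an observation instead of raising,
        # so the policy can react to the failure on the next step.
        tool = tools.get(action)
        observation = tool(argument) if tool else f"unknown tool: {action}"
        history.append(f"Action: {action}[{argument}]")
        history.append(f"Observation: {observation}")
    return "no answer within step budget"


def toy_policy(question: str, history: List[str]) -> Step:
    """Stand-in for a language model: search once, then answer with the result."""
    prefix = "Observation: "
    observations = [h for h in history if h.startswith(prefix)]
    if not observations:
        return ("I should look this up.", "search", question)
    return ("I have enough information.", "answer", observations[-1][len(prefix):])


# A stub tool set; a real agent would call a search or page-visit backend here.
toy_tools: Tools = {"search": lambda query: f"stub result for '{query}'"}
```

In a trained agent, the policy would be the fine-tuned model and the history would be fed back as context at each step; the stages listed in the abstract (trajectory sampling, supervised fine-tuning, reinforcement learning) concern how such a policy is obtained, which this sketch does not cover.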