AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
April 4, 2024
Authors: Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang
cs.AI
Abstract
Large language models (LLMs) have fueled many intelligent agent tasks, such
as web navigation, but most existing agents perform far from satisfactorily on
real-world webpages due to three factors: (1) the versatility of actions on
webpages, (2) HTML text exceeding model processing capacity, and (3) the
complexity of decision-making due to the open-domain nature of the web. In light of
these challenges, we develop AutoWebGLM, a GPT-4-outperforming automated web
navigation agent built upon ChatGLM3-6B. Inspired by human browsing patterns,
we design an HTML simplification algorithm to represent webpages, preserving
vital information succinctly. We employ a hybrid human-AI method to build web
browsing data for curriculum training. Then, we bootstrap the model with
reinforcement learning and rejection sampling to further improve its webpage
comprehension, browser operations, and efficient, self-directed task decomposition.
For testing, we establish a bilingual benchmark -- AutoWebBench -- for
real-world web browsing tasks. We evaluate AutoWebGLM across diverse web
navigation benchmarks, revealing its improvements but also the underlying
challenges of tackling real environments. Related code, model, and data will be
released at https://github.com/THUDM/AutoWebGLM.
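
The abstract does not describe how the HTML simplification algorithm works, so the following is only a rough sketch of the general idea (shrinking a page to a compact listing a model can read), not the authors' method. It assumes that keeping interactive elements and a few descriptive attributes is a reasonable proxy for "vital information"; the tag and attribute lists and the names `InteractiveElementExtractor` and `simplify_html` are hypothetical choices made for illustration.

```python
from html.parser import HTMLParser

# Assumption: these lists are illustrative choices, not taken from the paper.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
KEPT_ATTRS = ("type", "name", "placeholder", "value", "aria-label")
VOID_TAGS = {"input"}  # interactive tags that never have a closing tag


class InteractiveElementExtractor(HTMLParser):
    """Collects interactive elements, a few attributes, and their visible text."""

    def __init__(self):
        super().__init__()
        self.elements = []
        self._open = None  # interactive element whose text is being collected

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        element = {
            "id": len(self.elements),
            "tag": tag,
            # Keep only whitelisted attributes that carry a non-empty value.
            "attrs": {k: v for k, v in attrs if k in KEPT_ATTRS and v},
            "text": "",
        }
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self._open = element

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            self._open = None

    def handle_data(self, data):
        # Text outside interactive elements is dropped entirely.
        if self._open is not None:
            merged = self._open["text"] + " " + data
            self._open["text"] = " ".join(merged.split())


def simplify_html(html):
    """Return one numbered line per interactive element found in `html`."""
    parser = InteractiveElementExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for el in parser.elements:
        attrs = "".join(f' {k}="{v}"' for k, v in el["attrs"].items())
        lines.append(f'[{el["id"]}] <{el["tag"]}{attrs}> {el["text"]}'.rstrip())
    return "\n".join(lines)
```

Calling `simplify_html` on a page's HTML string yields a short numbered listing in which each line stands for one element an agent could act on. A real system would also need to handle visibility, element position, and page layout, none of which this sketch attempts.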