AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent
April 4, 2024
Authors: Hanyu Lai, Xiao Liu, Iat Long Iong, Shuntian Yao, Yuxuan Chen, Pengbo Shen, Hao Yu, Hanchen Zhang, Xiaohan Zhang, Yuxiao Dong, Jie Tang
cs.AI
Abstract
Large language models (LLMs) have fueled many intelligent agent tasks, such
as web navigation, but most existing agents perform far from satisfactorily on
real-world webpages due to three factors: (1) the versatility of actions on
webpages, (2) HTML text exceeding model processing capacity, and (3) the
complexity of decision-making due to the open-domain nature of the web. In
light of these challenges, we develop AutoWebGLM, a GPT-4-outperforming
automated web navigation agent built upon ChatGLM3-6B. Inspired by human
browsing patterns, we design an HTML simplification algorithm to represent
webpages, preserving vital information succinctly. We employ a hybrid human-AI
method to build web browsing data for curriculum training. Then, we bootstrap
the model by reinforcement learning and rejection sampling to further
facilitate its own webpage comprehension, browser operations, and efficient
task decomposition. For testing, we establish a bilingual benchmark,
AutoWebBench, for real-world web browsing tasks. We evaluate AutoWebGLM across
diverse web navigation benchmarks, revealing its improvements as well as the
underlying challenges of tackling real environments. Related code, model, and
data will be released at https://github.com/THUDM/AutoWebGLM.