
Adapting Web Agents with Synthetic Supervision

November 8, 2025
作者: Zhaoyang Wang, Yiming Liang, Xuchao Zhang, Qianhui Wu, Siwei Han, Anson Bastos, Rujia Wang, Chetan Bansal, Baolin Peng, Jianfeng Gao, Saravan Rajmohan, Huaxiu Yao
cs.AI

Abstract

Web agents struggle to adapt to new websites due to the scarcity of environment-specific tasks and demonstrations. Recent works have explored synthetic data generation to address this challenge; however, they suffer from data quality issues: synthesized tasks contain hallucinations that cannot be executed, and collected trajectories are noisy, with redundant or misaligned actions. In this paper, we propose SynthAgent, a fully synthetic supervision framework that improves synthetic data quality via dual refinement of both tasks and trajectories. Our approach begins by synthesizing diverse tasks through categorized exploration of web elements, ensuring efficient coverage of the target environment. During trajectory collection, we refine tasks when conflicts with actual observations are detected, mitigating hallucinations while maintaining task consistency. After collection, we conduct trajectory refinement with a global context to mitigate potential noise or misalignments. Finally, we fine-tune open-source web agents on the refined synthetic data to adapt them to the target environment. Experimental results demonstrate that SynthAgent outperforms existing synthetic data methods, validating the importance of high-quality synthetic supervision. The code will be publicly available at https://github.com/aiming-lab/SynthAgent.
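The abstract outlines a three-stage pipeline: categorized task synthesis, conflict-triggered task refinement during trajectory collection, and global trajectory refinement before fine-tuning. The toy sketch below illustrates that control flow only; all names, the conflict check, and the deduplication heuristic are illustrative assumptions, not the paper's implementation.

```python
from dataclasses import dataclass

@dataclass
class Task:
    target: str           # element category the task refers to (assumed representation)
    refined: bool = False  # set when the task was re-grounded against observations

def synthesize_tasks(page_elements):
    """Categorized exploration: one task per distinct element category."""
    return [Task(target=c) for c in sorted({e["category"] for e in page_elements})]

def collect_trajectory(task, page_elements):
    """Collect actions; refine the task on conflict with actual observations."""
    present = {e["category"] for e in page_elements}
    if task.target not in present:                 # hallucinated target detected
        task.target = sorted(present)[0]           # re-ground to an observed element
        task.refined = True
    # toy policy: click every element matching the (possibly refined) target
    return [("click", e["id"]) for e in page_elements if e["category"] == task.target]

def refine_trajectory(trajectory):
    """Global refinement stand-in: drop redundant consecutive actions."""
    cleaned = []
    for action in trajectory:
        if not cleaned or cleaned[-1] != action:
            cleaned.append(action)
    return cleaned

elements = [
    {"id": "btn1", "category": "button"},
    {"id": "btn1", "category": "button"},   # redundant duplicate observation
    {"id": "link1", "category": "link"},
]
tasks = synthesize_tasks(elements)              # one task per category
bad_task = Task(target="slider")                # references a non-existent element
trajectory = collect_trajectory(bad_task, elements)
cleaned = refine_trajectory(trajectory)
```

In this sketch, the hallucinated "slider" task is rewritten to target an element that actually exists, and the post-hoc pass removes the repeated click, mirroring the dual refinement of tasks and trajectories described above.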
PDF · December 1, 2025