CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation
January 28, 2025
Authors: Faria Huq, Zora Zhiruo Wang, Frank F. Xu, Tianyue Ou, Shuyan Zhou, Jeffrey P. Bigham, Graham Neubig
cs.AI
Abstract
While much work on web agents emphasizes the promise of autonomously
performing tasks on behalf of users, in reality, agents often fall short on
complex tasks in real-world contexts and at modeling user preferences. This
presents an opportunity for humans to collaborate with the agent and leverage
the agent's capabilities effectively. We propose CowPilot, a framework
supporting autonomous as well as human-agent collaborative web navigation, and
evaluation across task success and task efficiency. CowPilot reduces the number
of steps humans need to perform by allowing agents to propose next steps, while
users are able to pause, reject, or take alternative actions. During execution,
users can interleave their actions with the agent by overriding suggestions or
resuming agent control when needed. We conducted case studies on five common
websites and found that the human-agent collaborative mode achieves the highest
success rate of 95% while requiring humans to perform only 15.2% of the total
steps. Even with human interventions during task execution, the agent
successfully drives up to half of task success on its own. CowPilot can serve
as a useful tool for data collection and agent evaluation across websites,
which we believe will enable research in how users and agents can work
together. Video demonstrations are available at
https://oaishi.github.io/cowpilot.html
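
The interleaving described above, where the agent proposes each next step and the user either accepts it or substitutes an action of their own, can be illustrated with a minimal sketch. This is not CowPilot's actual implementation or API: the names `Step`, `run_episode`, `propose`, `review`, and `human_step_share` are hypothetical, and the sketch assumes that a rejected suggestion is always replaced by exactly one human action. It also shows how a figure such as the share of steps performed by the human could be computed from an episode log.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Step:
    action: str  # e.g. "click:search" (hypothetical action format)
    actor: str   # "agent" or "human"


def run_episode(
    propose: Callable[[List[Step]], Optional[str]],
    review: Callable[[str], Optional[str]],
    max_steps: int = 20,
) -> List[Step]:
    """Interleave agent suggestions with human overrides.

    propose(history) returns the agent's next suggested action,
    or None when the agent considers the task finished.
    review(suggestion) returns None to accept the suggestion,
    or an alternative action string to override it.
    """
    history: List[Step] = []
    for _ in range(max_steps):
        suggestion = propose(history)
        if suggestion is None:
            break
        override = review(suggestion)
        if override is None:
            history.append(Step(suggestion, "agent"))
        else:
            # Assumption: the human's action replaces the rejected one.
            history.append(Step(override, "human"))
    return history


def human_step_share(history: List[Step]) -> float:
    """Fraction of executed steps that were performed by the human."""
    if not history:
        return 0.0
    return sum(step.actor == "human" for step in history) / len(history)
```

As a usage note, a scripted `propose` that walks a fixed four-action plan, combined with a `review` that overrides one of those actions, yields a log with three agent steps and one human step, so `human_step_share` returns 0.25 for that episode.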