CowPilot: A Framework for Autonomous and Human-Agent Collaborative Web Navigation

Abstract

While much work on web agents emphasizes the promise of autonomously performing tasks on behalf of users, in practice agents often fall short on complex real-world tasks and on modeling user preferences. This presents an opportunity for humans to collaborate with the agent and leverage its capabilities effectively. We propose CowPilot, a framework that supports both autonomous and human-agent collaborative web navigation, along with evaluation of task success and task efficiency. CowPilot reduces the number of steps humans need to perform by letting the agent propose next steps, while users can pause, reject, or take alternative actions. During execution, users can interleave their actions with the agent's by overriding suggestions or handing control back to the agent when needed. We conducted case studies on five common websites and found that the human-agent collaborative mode achieves the highest success rate, 95%, while requiring humans to perform only 15.2% of the total steps. Even with human interventions during task execution, the agent drives up to half of the task successes on its own. CowPilot can serve as a useful tool for data collection and agent evaluation across websites, which we believe will enable research on how users and agents can work together. Video demonstrations are available at https://oaishi.github.io/cowpilot.html
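
To make the efficiency figure concrete, the following minimal Python sketch shows one way the share of human-performed steps could be computed from an interleaved trajectory. It is an illustration under stated assumptions, not CowPilot's implementation: the `Step` record, the `collaborate` helper, and the `"agent"`/`"human"` labels are hypothetical names introduced here, and the framework's exact metric definition may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    actor: str   # "agent" or "human" (hypothetical labels)
    action: str  # free-form description of the executed action


def collaborate(proposals: List[str], decisions: List[Optional[str]]) -> List[Step]:
    """Interleave agent proposals with user decisions.

    decisions[i] is None when the user accepts the agent's proposal,
    or a string naming the alternative action the user takes instead.
    """
    trajectory = []
    for proposal, decision in zip(proposals, decisions):
        if decision is None:
            # The user lets the agent's suggested step run.
            trajectory.append(Step("agent", proposal))
        else:
            # The user rejects the suggestion and acts directly.
            trajectory.append(Step("human", decision))
    return trajectory


def human_step_share(steps: List[Step]) -> float:
    """Fraction of executed steps that were performed by the human."""
    if not steps:
        return 0.0
    human = sum(1 for s in steps if s.actor == "human")
    return human / len(steps)


trajectory = collaborate(
    ["click search box", "type query", "click first result"],
    [None, "type 'red shoes'", None],
)
share = human_step_share(trajectory)
```

In this toy run the user overrides one of three proposed steps, so `share` is one third; a lower value means the agent carried more of the task.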