UFO: A UI-Focused Agent for Windows OS Interaction

February 8, 2024
Authors: Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
cs.AI

Abstract

We introduce UFO, a UI-Focused agent that fulfills user requests tailored to applications on Windows OS by harnessing the capabilities of GPT-Vision. UFO employs a dual-agent framework to observe and analyze the graphical user interface (GUI) and control information of Windows applications. This enables the agent to navigate and operate within individual applications and across them to fulfill user requests, even when a request spans multiple applications. The framework incorporates a control interaction module that grounds actions without human intervention and enables fully automated execution. Consequently, UFO turns arduous, time-consuming processes into tasks achievable solely through natural language commands. We tested UFO across 9 popular Windows applications, covering a variety of scenarios that reflect users' daily usage. The results, from both quantitative metrics and real-case studies, underscore the superior effectiveness of UFO in fulfilling user requests. To the best of our knowledge, UFO is the first UI agent specifically tailored for task completion within the Windows OS environment. The open-source code for UFO is available at https://github.com/microsoft/UFO.
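
The control flow the abstract describes (one agent choosing an application, a second choosing an action on a control, and a control interaction step grounding that action) might be sketched roughly as follows. This is an illustrative toy, not UFO's implementation: every name here (`Control`, `Action`, `select_app`, `select_action`, `ground_action`, `run`) is hypothetical, and simple keyword matching stands in for the GPT-Vision calls the paper relies on.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Control:
    label: str  # annotation label the model would see, e.g. "2"
    name: str   # human-readable control name, e.g. "Print"
    app: str    # application that owns the control


@dataclass
class Action:
    control_label: str  # label of the control chosen by the second agent
    operation: str      # e.g. "click"


def select_app(request: str, open_apps: List[str]) -> Optional[str]:
    """First agent (stub): pick the first app whose name appears in the request."""
    for app in open_apps:
        if app.lower() in request.lower():
            return app
    return None


def select_action(request: str, controls: List[Control]) -> Optional[Action]:
    """Second agent (stub): pick the first control whose name appears in the request."""
    for control in controls:
        if control.name.lower() in request.lower():
            return Action(control_label=control.label, operation="click")
    return None


def ground_action(action: Action, controls: List[Control], log: List[str]) -> None:
    """Control interaction (stub): resolve the label to a control and record the call."""
    by_label = {control.label: control for control in controls}
    target = by_label[action.control_label]
    # A real system would invoke the UI control here; this sketch only logs it.
    log.append(f"{action.operation}:{target.app}/{target.name}")


def run(request: str, desktop: Dict[str, List[Control]]) -> List[str]:
    """Run one request through both stub agents, then ground the chosen action."""
    log: List[str] = []
    app = select_app(request, list(desktop))
    if app is None:
        return log
    action = select_action(request, desktop[app])
    if action is not None:
        ground_action(action, desktop[app], log)
    return log


# Hypothetical desktop state: two open applications with a few labeled controls.
desktop = {
    "Word": [Control("1", "Save", "Word"), Control("2", "Print", "Word")],
    "Outlook": [Control("1", "Send", "Outlook")],
}
result = run("In Word, press Print", desktop)
```

Splitting the decision into "which application" and "which control" keeps each step's candidate set small, and routing execution through a separate grounding function means the agents only ever emit labels, never raw UI calls. A request that spans several applications would repeat this loop, which the sketch omits.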
