UFO: A UI-Focused Agent for Windows OS Interaction

February 8, 2024
Authors: Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang
cs.AI

Abstract

We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision. UFO employs a dual-agent framework to observe and analyze the graphical user interface (GUI) and control information of Windows applications. This enables the agent to navigate and operate both within individual applications and across them, so it can fulfill user requests that span multiple applications. The framework incorporates a control interaction module that grounds actions without human intervention, enabling fully automated execution. Consequently, UFO turns arduous and time-consuming processes into simple tasks achievable solely through natural language commands. We tested UFO on 9 popular Windows applications, covering a variety of scenarios that reflect users' daily usage. The results, derived from both quantitative metrics and real-case studies, underscore the superior effectiveness of UFO in fulfilling user requests. To the best of our knowledge, UFO is the first UI agent specifically tailored for task completion within the Windows OS environment. The open-source code for UFO is available at https://github.com/microsoft/UFO.
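
To make the control flow described in the abstract more concrete, here is a minimal Python sketch of a dual-agent loop with a control interaction step. It is not taken from the UFO codebase. Every name in it (`App`, `Control`, `Action`, `select_application`, `select_action`, `ground_and_execute`, `run`) is hypothetical, the split of roles between the two agents is an assumption the abstract does not spell out, and simple keyword matching stands in for the GPT-Vision calls.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Control:
    label: str  # text shown on the UI control, e.g. "Save"
    kind: str   # control type, e.g. "button" or "menu"


@dataclass
class App:
    name: str
    controls: list[Control]
    log: list[str] = field(default_factory=list)  # record of executed actions


@dataclass
class Action:
    control_label: str
    operation: str


def select_application(request: str, apps: list[App]) -> App | None:
    """Hypothetical first agent: pick the application the request mentions.

    A real system would query a vision-language model here; keyword
    matching is only a stand-in.
    """
    for app in apps:
        if app.name.lower() in request.lower():
            return app
    return None


def select_action(request: str, app: App, done: set[str]) -> Action | None:
    """Hypothetical second agent: pick the next control to operate on."""
    for control in app.controls:
        if control.label.lower() in request.lower() and control.label not in done:
            return Action(control.label, "click")
    return None  # nothing left to do


def ground_and_execute(action: Action, app: App) -> bool:
    """Stand-in for the control interaction module.

    Grounding here means checking that the chosen control really exists
    in the application before "executing" it (we only append to a log).
    """
    if action.control_label not in {c.label for c in app.controls}:
        return False
    app.log.append(f"{action.operation}:{action.control_label}")
    return True


def run(request: str, apps: list[App], max_steps: int = 5) -> list[str]:
    """Run the two agents in a loop until no action remains or a step fails."""
    app = select_application(request, apps)
    if app is None:
        return []
    done: set[str] = set()
    for _ in range(max_steps):
        action = select_action(request, app, done)
        if action is None or not ground_and_execute(action, app):
            break
        done.add(action.control_label)
    return app.log


if __name__ == "__main__":
    demo_apps = [
        App("Notepad", [Control("File", "menu"), Control("Save", "button")]),
        App("Outlook", [Control("Send", "button")]),
    ]
    print(run("In Notepad, open File and click Save", demo_apps))
    # prints ['click:File', 'click:Save']
```

The sketch only illustrates the separation between choosing an application, choosing an action, and grounding that action on a concrete control. How UFO itself implements each stage is documented in the repository linked above.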
