AMEX: Android Multi-annotation Expo Dataset for Mobile GUI Agents

July 3, 2024
Authors: Yuxiang Chai, Siyuan Huang, Yazhe Niu, Han Xiao, Liang Liu, Dingyu Zhang, Peng Gao, Shuai Ren, Hongsheng Li
cs.AI

Abstract

AI agents have drawn increasing attention, mostly for their ability to perceive environments, understand tasks, and autonomously achieve goals. To advance research on AI agents in mobile scenarios, we introduce the Android Multi-annotation EXpo (AMEX), a comprehensive, large-scale dataset designed for generalist mobile GUI-control agents. The dataset is used to train and evaluate these agents' ability to complete complex tasks by interacting directly with the graphical user interface (GUI) on mobile devices. AMEX comprises over 104K high-resolution screenshots from 110 popular mobile applications, annotated at multiple levels. Unlike existing mobile device-control datasets such as MoTIF and AitW, AMEX includes three levels of annotation: GUI interactive element grounding, GUI screen and element functionality descriptions, and complex natural language instructions, each averaging 13 steps, with stepwise GUI-action chains. We develop this dataset from a more instructive and detailed perspective, complementing the general settings of existing datasets. Additionally, we develop a baseline model, SPHINX Agent, and compare its performance with that of state-of-the-art agents trained on other datasets. To facilitate further research, we open-source our dataset, models, and relevant evaluation tools. The project is available at https://yuxiangchai.github.io/AMEX/
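
One way to picture the three annotation levels described in the abstract is as nested records: grounded elements on a screen, functionality descriptions attached to screens and elements, and an instruction paired with a chain of GUI actions. The sketch below is illustrative only. Every class name, field name, and the action string "tap" are hypothetical and are not taken from the released AMEX schema; consult the project page for the actual format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# NOTE: all names here are hypothetical, chosen for illustration.
# They do not reflect the actual AMEX file format.
BBox = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


@dataclass
class Element:
    """Level 1: an interactive element grounded on a screenshot."""
    bbox: BBox
    # Level 2: optional description of what the element does.
    functionality: Optional[str] = None


@dataclass
class Screen:
    """A screenshot together with its annotated elements."""
    screenshot_path: str
    elements: List[Element] = field(default_factory=list)
    # Level 2: optional description of the screen as a whole.
    description: Optional[str] = None


@dataclass
class Step:
    """One link in a GUI-action chain."""
    screen: Screen
    action: str  # assumed action label, e.g. "tap"
    target: Optional[Element] = None


@dataclass
class Episode:
    """Level 3: a natural language instruction with its action chain."""
    instruction: str
    steps: List[Step] = field(default_factory=list)


def mean_steps(episodes: List[Episode]) -> float:
    """Average number of steps per instruction (0.0 for an empty list)."""
    if not episodes:
        return 0.0
    return sum(len(e.steps) for e in episodes) / len(episodes)
```

Under this layout, the "averaging 13 steps" figure in the abstract would correspond to `mean_steps` computed over all instruction episodes.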
