Android in the Wild: A Large-Scale Dataset for Android Device Control

July 19, 2023
Authors: Christopher Rawles, Alice Li, Daniel Rodriguez, Oriana Riva, Timothy Lillicrap
cs.AI

Abstract

There is a growing interest in device-control systems that can interpret human natural language instructions and execute them on a digital device by directly controlling its user interface. We present a dataset for device-control research, Android in the Wild (AITW), which is orders of magnitude larger than current datasets. The dataset contains human demonstrations of device interactions, including the screens and actions, and corresponding natural language instructions. It consists of 715k episodes spanning 30k unique instructions, four versions of Android (v10-13), and eight device types (Pixel 2 XL to Pixel 6) with varying screen resolutions. It contains multi-step tasks that require semantic understanding of language and visual context. This dataset poses a new challenge: actions available through the user interface must be inferred from their visual appearance. Moreover, instead of simple UI element-based actions, the action space consists of precise gestures (e.g., horizontal scrolls to operate carousel widgets). We organize our dataset to encourage robustness analysis of device-control systems, i.e., how well a system performs in the presence of new task descriptions, new applications, or new platform versions. We develop two agents and report performance across the dataset. The dataset is available at https://github.com/google-research/google-research/tree/master/android_in_the_wild.