MobiAgent: A Systematic Framework for Customizable Mobile Agents

August 30, 2025
Authors: Cheng Zhang, Erhu Feng, Xi Zhao, Yisheng Zhao, Wangbo Gong, Jiahui Sun, Dong Du, Zhichao Hua, Yubin Xia, Haibo Chen
cs.AI

Abstract

With the rapid advancement of Vision-Language Models (VLMs), GUI-based mobile agents have emerged as a key development direction for intelligent mobile systems. However, existing agent models continue to face significant challenges in real-world task execution, particularly in terms of accuracy and efficiency. To address these limitations, we propose MobiAgent, a comprehensive mobile agent system comprising three core components: the MobiMind-series agent models, the AgentRR acceleration framework, and the MobiFlow benchmarking suite. Furthermore, recognizing that the capabilities of current mobile agents are still limited by the availability of high-quality data, we have developed an AI-assisted agile data collection pipeline that significantly reduces the cost of manual annotation. Compared to both general-purpose LLMs and specialized GUI agent models, MobiAgent achieves state-of-the-art performance in real-world mobile scenarios.
September 3, 2025