MobiAgent: A Systematic Framework for Customizable Mobile Agents
August 30, 2025
Authors: Cheng Zhang, Erhu Feng, Xi Zhao, Yisheng Zhao, Wangbo Gong, Jiahui Sun, Dong Du, Zhichao Hua, Yubin Xia, Haibo Chen
cs.AI
Abstract
With the rapid advancement of Vision-Language Models (VLMs), GUI-based mobile
agents have emerged as a key development direction for intelligent mobile
systems. However, existing agent models continue to face significant challenges
in real-world task execution, particularly in terms of accuracy and efficiency.
To address these limitations, we propose MobiAgent, a comprehensive mobile
agent system comprising three core components: the MobiMind-series agent
models, the AgentRR acceleration framework, and the MobiFlow benchmarking
suite. Furthermore, recognizing that the capabilities of current mobile agents
are still limited by the availability of high-quality data, we have developed
an AI-assisted agile data collection pipeline that significantly reduces the
cost of manual annotation. Compared to both general-purpose LLMs and
specialized GUI agent models, MobiAgent achieves state-of-the-art performance
in real-world mobile scenarios.