MobileEgo Anywhere: Open Infrastructure for Long-Horizon Egocentric Data on Commodity Hardware
May 7, 2026
Authors: Senthil Palanisamy, Abhishek Anand, Satpal Singh Rathor, Pratyush Patnaik, Shubhanshu Khatana
cs.AI
Abstract
Recent advances in Vision-Language-Action (VLA) models have created a critical demand for large-scale egocentric datasets. However, existing datasets are often limited to short episodes, typically spanning only a few minutes, and so fail to capture the long-horizon temporal dependencies necessary for complex robotic task execution. To bridge this gap, we present MobileEgo Anywhere, a framework designed to facilitate the collection of robust, hour-plus egocentric trajectories using commodity mobile hardware. We leverage the ubiquitous sensor suites of modern smartphones to provide high-fidelity, long-term camera pose tracking, effectively removing the high hardware barriers associated with traditional robotics data collection. Our contributions are threefold: (1) we release a novel dataset comprising 200 hours of diverse, long-form egocentric data with persistent state tracking; (2) we open-source a mobile application that enables any user to record egocentric data; and (3) we provide a comprehensive processing pipeline to convert raw mobile captures into standardized, training-ready formats for VLA and foundation model research. By democratizing the data collection process, this work enables the large-scale acquisition of long-horizon data across varied global environments, accelerating the development of generalizable robotic policies.