RoboPocket: Improve Robot Policies Instantly with Your Phone
March 5, 2026
Authors: Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu
cs.AI
Abstract
Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators blindly collect demonstrations without knowing the underlying policy's weaknesses, leading to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using a single consumer smartphone. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, effectively closing the learning loop in minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles the data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop boosts sample efficiency by up to 2× in distributed environments with only a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.
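The asynchronous Online Finetuning pipeline described above can be illustrated with a minimal sketch: collectors stream demonstrations into a queue while a background trainer continuously consumes them and publishes updated policy snapshots, which the smartphone then fetches for AR trajectory preview. This is not the paper's implementation; the class and method names (`AsyncFinetuner`, `submit_demo`, `snapshot`) are hypothetical, and the "training step" is a placeholder version bump standing in for a real gradient update.

```python
import queue
import threading


class PolicySnapshot:
    """Hypothetical stand-in for policy weights; here just a version counter."""
    def __init__(self, version=0):
        self.version = version


class AsyncFinetuner:
    """Sketch of an asynchronous online-finetuning loop: a background trainer
    consumes demonstrations as they arrive and publishes updated policy
    snapshots, while collectors keep streaming data without blocking."""

    def __init__(self):
        self.demos = queue.Queue()
        self._latest = PolicySnapshot()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._train_loop, daemon=True)

    def start(self):
        self._worker.start()

    def submit_demo(self, demo):
        # Collector side: enqueue a new demonstration (non-blocking).
        self.demos.put(demo)

    def snapshot(self):
        # Inference side: fetch the most recent policy, e.g. to render
        # AR visual foresight of its predicted trajectory on the phone.
        with self._lock:
            return self._latest

    def _train_loop(self):
        while not self._stop.is_set():
            try:
                demo = self.demos.get(timeout=0.1)
            except queue.Empty:
                continue
            # Placeholder "gradient step": a real system would fine-tune
            # the policy on the new demonstration here.
            with self._lock:
                self._latest = PolicySnapshot(self._latest.version + 1)
            self.demos.task_done()

    def stop(self):
        self._stop.set()
        self._worker.join()
```

Because the trainer runs in its own thread, data collection and policy updates overlap in time, which is what lets the learning loop close within minutes rather than waiting for a batch retraining cycle.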