RoboPocket: Improve Robot Policies Instantly with Your Phone
March 5, 2026
Authors: Junjie Fang, Wendi Chen, Han Xue, Fangyuan Zhou, Tian Le, Yi Wang, Yuting Zhang, Jun Lv, Chuan Wen, Cewu Lu
cs.AI
Abstract
Scaling imitation learning is fundamentally constrained by the efficiency of data collection. While handheld interfaces have emerged as a scalable solution for in-the-wild data acquisition, they predominantly operate in an open-loop manner: operators collect demonstrations blindly, without knowing the underlying policy's weaknesses, which leads to inefficient coverage of critical state distributions. Conversely, interactive methods like DAgger effectively address covariate shift but rely on physical robot execution, which is costly and difficult to scale. To reconcile this trade-off, we introduce RoboPocket, a portable system that enables Robot-Free Instant Policy Iteration using a single consumer smartphone. Its core innovation is a Remote Inference framework that visualizes the policy's predicted trajectory via Augmented Reality (AR) Visual Foresight. This immersive feedback allows collectors to proactively identify potential failures and focus data collection on the policy's weak regions without requiring a physical robot. Furthermore, we implement an asynchronous Online Finetuning pipeline that continuously updates the policy with incoming data, closing the learning loop within minutes. Extensive experiments demonstrate that RoboPocket adheres to data scaling laws and doubles data efficiency compared to offline scaling strategies, overcoming their long-standing efficiency bottleneck. Moreover, our instant iteration loop boosts sample efficiency by up to 2× in distributed environments with only a small number of interactive corrections per person. Project page and videos: https://robo-pocket.github.io.
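The asynchronous Online Finetuning idea in the abstract (a policy that keeps updating while new data arrives) can be illustrated with a toy sketch. This is not the authors' pipeline: `LinearPolicy`, `trainer`, and `run_demo` are hypothetical names, a one-parameter linear model stands in for the real policy, and a synthetic "expert" that doubles the observation stands in for human corrections. The only point is the structure: a collector pushes samples into a queue, and a background thread finetunes on the growing buffer without blocking collection.

```python
import queue
import threading


class LinearPolicy:
    """Toy 1-D policy (assumption for illustration): action = w * observation."""

    def __init__(self):
        self.w = 0.0
        self.version = 0  # incremented after every finetuning step
        self._lock = threading.Lock()

    def predict(self, obs):
        # Inference may run concurrently with training, so guard the weight.
        with self._lock:
            return self.w * obs

    def finetune(self, batch, lr=0.1):
        # One gradient step on the mean squared error over the batch.
        with self._lock:
            grad = sum(2 * (self.w * o - a) * o for o, a in batch) / len(batch)
            self.w -= lr * grad
            self.version += 1


def trainer(policy, incoming):
    """Background loop: absorb each new sample, then finetune on all data so far."""
    buffer = []
    while True:
        item = incoming.get()
        if item is None:  # sentinel: collection has finished
            break
        buffer.append(item)
        policy.finetune(buffer)


def run_demo(num_samples=200):
    policy = LinearPolicy()
    incoming = queue.Queue()
    worker = threading.Thread(target=trainer, args=(policy, incoming))
    worker.start()

    # Collector side: a synthetic expert whose action is 2 * observation.
    for i in range(num_samples):
        obs = 1.0 + (i % 5) * 0.1
        incoming.put((obs, 2.0 * obs))

    incoming.put(None)  # signal the trainer to stop
    worker.join()
    return policy


if __name__ == "__main__":
    p = run_demo()
    print(p.version, p.w)
```

In this sketch the collector never waits for training to finish; it only enqueues data. A real system would presumably replace the in-process queue with a network upload and the linear model with a learned visuomotor policy, but those details are not specified in the abstract.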