Robotic Table Tennis: A Case Study into a High Speed Learning System
September 6, 2023
Authors: David B. D'Ambrosio, Jonathan Abelian, Saminda Abeyruwan, Michael Ahn, Alex Bewley, Justin Boyd, Krzysztof Choromanski, Omar Cortes, Erwin Coumans, Tianli Ding, Wenbo Gao, Laura Graesser, Atil Iscen, Navdeep Jaitly, Deepali Jain, Juhana Kangaspunta, Satoshi Kataoka, Gus Kouretas, Yuheng Kuang, Nevena Lazic, Corey Lynch, Reza Mahjourian, Sherry Q. Moore, Thinh Nguyen, Ken Oslund, Barney J Reed, Krista Reymann, Pannag R. Sanketi, Anish Shankar, Pierre Sermanet, Vikas Sindhwani, Avi Singh, Vincent Vanhoucke, Grace Vesom, Peng Xu
cs.AI
Abstract
We present a deep dive into a real-world robotic learning system that, in
previous work, was shown to be capable of hundreds of table tennis rallies with
a human and has the ability to precisely return the ball to desired targets.
This system puts together a highly optimized perception subsystem, a high-speed
low-latency robot controller, a simulation paradigm that can prevent damage in
the real world and also train policies for zero-shot transfer, and automated
real-world environment resets that enable autonomous training and evaluation on
physical robots. We complement a complete system description, including
numerous design decisions that are typically not widely disseminated, with a
collection of studies that clarify the importance of mitigating various sources
of latency, accounting for training and deployment distribution shifts,
robustness of the perception system, sensitivity to policy hyper-parameters,
and choice of action space. A video demonstrating the components of the system
and details of experimental results can be found at
https://youtu.be/uFcnWjB42I0.