TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction
May 16, 2024
Authors: Yunfan Jiang, Chen Wang, Ruohan Zhang, Jiajun Wu, Li Fei-Fei
cs.AI
Abstract
Learning in simulation and transferring the learned policy to the real world
has the potential to enable generalist robots. The key challenge of this
approach is to address simulation-to-reality (sim-to-real) gaps. Previous
methods often require domain-specific knowledge a priori. We argue that a
straightforward way to obtain such knowledge is by asking humans to observe and
assist robot policy execution in the real world. The robots can then learn from
humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven
approach to enable successful sim-to-real transfer based on a human-in-the-loop
framework. TRANSIC allows humans to augment simulation policies to overcome
various unmodeled sim-to-real gaps holistically through intervention and online
correction. Residual policies can be learned from human corrections and
integrated with simulation policies for autonomous execution. We show that our
approach can achieve successful sim-to-real transfer in complex and
contact-rich manipulation tasks such as furniture assembly. Through synergistic
integration of policies learned in simulation and from humans, TRANSIC is
effective as a holistic approach to addressing various, often coexisting
sim-to-real gaps. It displays attractive properties such as scaling with human
effort. Videos and code are available at https://transic-robot.github.io/.
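
The idea of learning a residual policy from human corrections and combining it with a simulation policy can be illustrated with a toy sketch. This is not the TRANSIC implementation: the linear `base_policy`, the single scalar residual gain, the synthetic "human" actions, and all function names are assumptions chosen only to keep the example short and runnable.

```python
import random

def base_policy(obs):
    """Hypothetical stand-in for a policy trained in simulation."""
    return [-0.5 * o for o in obs]

def fit_residual_gain(observations, human_actions):
    """Least-squares fit of a scalar gain g so that g * obs
    approximates (human action - base action)."""
    num = 0.0
    den = 0.0
    for obs, human in zip(observations, human_actions):
        base = base_policy(obs)
        for o, b, h in zip(obs, base, human):
            num += o * (h - b)  # correlation of input with the correction
            den += o * o
    return num / den

def combined_policy(obs, gain):
    """Autonomous execution: simulation action plus learned residual."""
    return [b + gain * o for b, o in zip(base_policy(obs), obs)]

# Synthetic correction data (assumption): the human steers the robot
# toward a gain of -0.8, whereas the simulation policy uses -0.5.
random.seed(0)
observations = [[random.gauss(0, 1) for _ in range(3)] for _ in range(200)]
human_actions = [[-0.8 * o for o in obs] for obs in observations]

residual_gain = fit_residual_gain(observations, human_actions)
action = combined_policy([1.0, -2.0, 0.5], residual_gain)
```

In this sketch the simulation policy is left untouched and only the residual is fit to the correction data, so the behavior learned in simulation is retained while the residual absorbs the mismatch.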