TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction

Abstract

Learning in simulation and transferring the learned policy to the real world has the potential to enable generalist robots. The key challenge of this approach is to address simulation-to-reality (sim-to-real) gaps. Previous methods often require domain-specific knowledge a priori. We argue that a straightforward way to obtain such knowledge is by asking humans to observe and assist robot policy execution in the real world. The robots can then learn from humans to close various sim-to-real gaps. We propose TRANSIC, a data-driven approach to enable successful sim-to-real transfer based on a human-in-the-loop framework. TRANSIC allows humans to augment simulation policies to overcome various unmodeled sim-to-real gaps holistically through intervention and online correction. Residual policies can be learned from human corrections and integrated with simulation policies for autonomous execution. We show that our approach can achieve successful sim-to-real transfer in complex and contact-rich manipulation tasks such as furniture assembly. Through synergistic integration of policies learned in simulation and from humans, TRANSIC is effective as a holistic approach to addressing various, often coexisting sim-to-real gaps. It displays attractive properties such as scaling with human effort. Videos and code are available at https://transic-robot.github.io/
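The idea of learning a residual policy from human corrections and combining it with a simulation policy can be illustrated with a toy sketch. This is not the TRANSIC implementation: `sim_policy`, `fit_residual`, and `combined_policy` are hypothetical names, the residual is assumed to be a simple linear model fit by least squares, and the "human corrections" are synthetic data with a constant offset.

```python
import numpy as np


def sim_policy(obs):
    """Hypothetical stand-in for a policy trained in simulation."""
    return 0.5 * obs


def fit_residual(observations, base_actions, corrected_actions):
    """Fit a linear residual model by least squares (an assumption for this sketch).

    The regression target is the difference between the human-corrected
    action and the action the simulation policy proposed.
    """
    # Append a constant feature so the model can learn a bias term.
    features = np.hstack([observations, np.ones((len(observations), 1))])
    targets = corrected_actions - base_actions
    weights, *_ = np.linalg.lstsq(features, targets, rcond=None)
    return weights


def combined_policy(obs, weights):
    """Simulation policy action plus the learned residual correction."""
    base = sim_policy(obs)
    residual = np.append(obs, 1.0) @ weights
    return base + residual


# Synthetic data: the "human" always shifts the proposed action by +0.1.
rng = np.random.default_rng(0)
observations = rng.normal(size=(50, 3))
base_actions = sim_policy(observations)
corrected_actions = base_actions + 0.1

weights = fit_residual(observations, base_actions, corrected_actions)
action = combined_policy(np.zeros(3), weights)
```

In this toy setting the residual model recovers the constant offset, so the combined policy reproduces the corrected behavior without further human input. A real system would use richer observations and a learned, nonlinear residual.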

