Lessons from Learning to Spin "Pens"
July 26, 2024
Authors: Jun Wang, Ying Yuan, Haichuan Che, Haozhi Qi, Yi Ma, Jitendra Malik, Xiaolong Wang
cs.AI
Abstract
In-hand manipulation of pen-like objects is an important skill in our daily
lives, as many tools such as hammers and screwdrivers are similarly shaped.
However, current learning-based methods struggle with this task due to a lack
of high-quality demonstrations and the significant gap between simulation and
the real world. In this work, we push the boundaries of learning-based in-hand
manipulation systems by demonstrating the capability to spin pen-like objects.
We first use reinforcement learning to train an oracle policy with privileged
information and generate a high-fidelity trajectory dataset in simulation. This
serves two purposes: 1) pre-training a sensorimotor policy in simulation; 2)
conducting open-loop trajectory replay in the real world. We then fine-tune the
sensorimotor policy using these real-world trajectories to adapt it to
real-world dynamics. With fewer than 50 trajectories, our policy learns to rotate
more than ten pen-like objects with different physical properties for multiple
revolutions. We present a comprehensive analysis of our design choices and
share the lessons learned during development.