SimToolReal: An Object-Centric Policy for Zero-Shot Dexterous Tool Manipulation
February 18, 2026
Authors: Kushal Kedia, Tyler Ga Wei Lum, Jeannette Bohg, C. Karen Liu
cs.AI
Abstract
The ability to manipulate tools significantly expands the set of tasks a robot can perform. Yet tool manipulation is a challenging class of dexterity, requiring grasps of thin objects, in-hand object rotations, and forceful interactions. Since collecting teleoperation data for these behaviors is difficult, sim-to-real reinforcement learning (RL) is a promising alternative. However, prior approaches typically require substantial engineering effort to model objects and tune reward functions for each task. In this work, we propose SimToolReal, taking a step towards generalizing sim-to-real RL policies for tool manipulation. Instead of focusing on a single object and task, we procedurally generate a large variety of tool-like object primitives in simulation and train a single RL policy with the universal goal of manipulating each object to random goal poses. This approach enables SimToolReal to perform general dexterous tool manipulation at test time without any object- or task-specific training. We demonstrate that SimToolReal outperforms prior retargeting and fixed-grasp methods by 37% while matching the performance of specialist RL policies trained on specific target objects and tasks. Finally, we show that SimToolReal generalizes across a diverse set of everyday tools, achieving strong zero-shot performance over 120 real-world rollouts spanning 24 tasks, 12 object instances, and 6 tool categories.
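To make the training setup in the abstract concrete, the following is a minimal sketch of how one episode might be sampled: a procedurally generated tool-like object built from primitives, plus a random goal pose. It is not the authors' implementation. The primitive types, the hammer-like handle-plus-head layout, all size and workspace ranges, and the names `Primitive`, `sample_tool`, `sample_goal_pose`, and `sample_episode` are assumptions made for illustration only.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Primitive:
    shape: str     # "box" or "cylinder" (hypothetical primitive types)
    size: tuple    # box: (x, y, z) extents; cylinder: (radius, length); meters
    offset: tuple  # center of the part in the tool frame; meters


def sample_tool(rng):
    """Sample a hammer-like tool: a thin handle with a box head at one end.

    All ranges below are illustrative assumptions, not values from the paper.
    """
    handle_len = rng.uniform(0.10, 0.30)
    handle_rad = rng.uniform(0.005, 0.02)
    handle = Primitive("cylinder", (handle_rad, handle_len), (0.0, 0.0, 0.0))
    head_size = (
        rng.uniform(0.02, 0.06),
        rng.uniform(0.02, 0.10),
        rng.uniform(0.02, 0.06),
    )
    # Place the head at the far end of the handle (handle axis assumed along x).
    head = Primitive("box", head_size, (handle_len / 2, 0.0, 0.0))
    return [handle, head]


def sample_goal_pose(rng):
    """Sample a goal pose: a position in an assumed workspace box and a
    uniformly distributed random orientation as a unit quaternion."""
    position = (
        rng.uniform(-0.2, 0.2),
        rng.uniform(-0.2, 0.2),
        rng.uniform(0.1, 0.4),
    )
    # Shoemake's method for sampling uniform random rotations.
    u1, u2, u3 = rng.random(), rng.random(), rng.random()
    quat = (
        math.sqrt(1 - u1) * math.sin(2 * math.pi * u2),
        math.sqrt(1 - u1) * math.cos(2 * math.pi * u2),
        math.sqrt(u1) * math.sin(2 * math.pi * u3),
        math.sqrt(u1) * math.cos(2 * math.pi * u3),
    )
    return position, quat


def sample_episode(seed):
    """Draw one (tool, goal pose) pair; a seeded RNG keeps it reproducible."""
    rng = random.Random(seed)
    return {"tool": sample_tool(rng), "goal": sample_goal_pose(rng)}
```

In a sketch like this, a single goal-conditioned policy would see a fresh tool and goal pose each episode, which is the property the abstract relies on for zero-shot transfer: at test time a new tool and task could be expressed as a sequence of goal poses without retraining.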