

Learning to Grasp Anything by Playing with Random Toys

October 14, 2025
Authors: Dantong Niu, Yuvan Sharma, Baifeng Shi, Rachel Ding, Matteo Gioia, Haoru Xue, Henry Tsai, Konstantinos Kallidromitis, Anirudh Pai, Shankar Shastry, Trevor Darrell, Jitendra Malik, Roei Herzig
cs.AI

Abstract

Robotic manipulation policies often struggle to generalize to novel objects, limiting their real-world utility. In contrast, cognitive science suggests that children develop generalizable dexterous manipulation skills by mastering a small set of simple toys and then applying that knowledge to more complex items. Inspired by this, we study whether similar generalization capabilities can also be achieved by robots. Our results indicate that robots can learn generalizable grasping using randomly assembled objects composed of just four shape primitives: spheres, cuboids, cylinders, and rings. We show that training on these "toys" enables robust generalization to real-world objects, yielding strong zero-shot performance. Crucially, we find the key to this generalization is an object-centric visual representation induced by our proposed detection pooling mechanism. Evaluated both in simulation and on physical robots, our model achieves a 67% real-world grasping success rate on the YCB dataset, outperforming state-of-the-art approaches that rely on substantially more in-domain data. We further study how zero-shot generalization performance scales as we vary the number and diversity of training toys and the number of demonstrations per toy. We believe this work offers a promising path toward scalable and generalizable learning in robotic manipulation. Demonstration videos, code, checkpoints, and our dataset are available on our project page: https://lego-grasp.github.io/ .
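The abstract does not specify how the proposed detection pooling mechanism is implemented; a common way to obtain an object-centric embedding from a detector is to pool the visual feature map over each detected bounding box. The sketch below is a hypothetical NumPy illustration of that general idea (the function name, box format, and averaging choice are assumptions, not the paper's actual method):

```python
import numpy as np

def detection_pooling(feature_map, box):
    """Average-pool feature-map cells inside a detected bounding box.

    This is an illustrative stand-in for an object-centric pooling step,
    not the paper's implementation.

    feature_map: (H, W, C) array of visual features.
    box: (x0, y0, x1, y1) in feature-map coordinates, exclusive upper bounds.
    Returns a (C,) object-centric embedding.
    """
    x0, y0, x1, y1 = box
    region = feature_map[y0:y1, x0:x1, :]          # crop to the detection
    return region.reshape(-1, region.shape[-1]).mean(axis=0)

# Toy example: an 8x8 feature map with 4 channels; the "object" occupies
# a small region whose features are 1, while the background is 0.
fmap = np.zeros((8, 8, 4))
fmap[2:4, 1:4, :] = 1.0
emb = detection_pooling(fmap, (1, 2, 4, 4))        # pools only over the object
print(emb)  # -> [1. 1. 1. 1.]
```

Pooling within the detected box discards background context, which is one plausible reason an object-centric representation of this kind would transfer across visually different scenes.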