Dynamic-Resolution Model Learning for Object Pile Manipulation
June 29, 2023
Authors: Yixuan Wang, Yunzhu Li, Katherine Driggs-Campbell, Li Fei-Fei, Jiajun Wu
cs.AI
Abstract
Dynamics models learned from visual observations have been shown to be effective in various robotic manipulation tasks. One of the key questions for learning such dynamics models is what scene representation to use. Prior works typically assume a representation at a fixed dimension or resolution, which may be inefficient for simple tasks and ineffective for more complicated ones. In this work, we investigate how to learn dynamic and adaptive representations at different levels of abstraction to achieve the optimal trade-off between efficiency and effectiveness. Specifically, we construct dynamic-resolution particle representations of the environment and learn a unified dynamics model using graph neural networks (GNNs) that allows continuous selection of the abstraction level. At test time, the agent adaptively determines the optimal resolution at each model-predictive control (MPC) step. We evaluate our method on object pile manipulation, a task commonly encountered in cooking, agriculture, manufacturing, and pharmaceutical applications. Through comprehensive evaluations in both simulation and the real world, we show that our method significantly outperforms state-of-the-art fixed-resolution baselines at gathering, sorting, and redistributing object piles made of various granular instances such as coffee beans, almonds, and corn.
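
To make the abstract's pipeline concrete, here is a minimal NumPy-only sketch of the general idea: subsample a dense point cloud of the pile into particle sets at several candidate resolutions, roll each through a toy one-step dynamics on a radius graph (a stand-in for the learned GNN), and let an MPC-style loop pick the resolution that best trades off predicted task error against particle count. The farthest-point sampler, the placeholder dynamics, and the cost weighting below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def farthest_point_sampling(points, n_samples, seed=0):
    """Pick n_samples well-spread particles from a dense point cloud."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(points)))]
    dists = np.linalg.norm(points - points[chosen[0]], axis=1)
    for _ in range(n_samples - 1):
        idx = int(np.argmax(dists))
        chosen.append(idx)
        dists = np.minimum(dists, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

def radius_graph(particles, radius):
    """Connect particles closer than `radius`; return edge index arrays."""
    dist = np.linalg.norm(particles[:, None, :] - particles[None, :, :], axis=-1)
    src, dst = np.nonzero((dist < radius) & (dist > 0))
    return src, dst

def toy_dynamics_step(particles, action, radius=0.1):
    """Placeholder one-step dynamics: average-neighbor smoothing plus a push
    along the action direction. A stand-in for the learned GNN rollout."""
    src, dst = radius_graph(particles, radius)
    nxt = particles.copy()
    for i in range(len(particles)):
        nbrs = dst[src == i]
        if len(nbrs) > 0:
            nxt[i] = 0.7 * particles[i] + 0.3 * particles[nbrs].mean(axis=0)
    return nxt + 0.05 * action

def mpc_select_resolution(cloud, goal, candidate_resolutions=(20, 50, 100)):
    """Score each candidate resolution by predicted cost-to-goal plus a
    per-particle efficiency penalty, and return the best one."""
    action = goal - cloud.mean(axis=0)
    action /= np.linalg.norm(action) + 1e-8
    best_res, best_score = None, np.inf
    for res in candidate_resolutions:
        particles = farthest_point_sampling(cloud, res)
        pred = toy_dynamics_step(particles, action)
        cost = np.linalg.norm(pred.mean(axis=0) - goal)  # task-error proxy
        score = cost + 1e-3 * res                        # coarser is cheaper
        if score < best_score:
            best_res, best_score = res, score
    return best_res

if __name__ == "__main__":
    cloud = np.random.default_rng(0).uniform(0, 1, size=(500, 2))  # dense pile
    goal = np.array([0.8, 0.8])                                    # target centroid
    print("chosen resolution:", mpc_select_resolution(cloud, goal))
```

In this toy version the resolution choice is a simple scalar trade-off; the paper's unified GNN model instead supports a continuous range of abstraction levels, so the agent can re-select the resolution at every MPC step as the pile's configuration and the task demands change.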