Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement
July 10, 2023
Authors: Anthony Simeonov, Ankit Goyal, Lucas Manuelli, Lin Yen-Chen, Alina Sarmiento, Alberto Rodriguez, Pulkit Agrawal, Dieter Fox
cs.AI
Abstract
We propose a system for rearranging objects in a scene to achieve a desired
object-scene placing relationship, such as a book inserted in an open slot of a
bookshelf. The pipeline generalizes to novel geometries, poses, and layouts of
both scenes and objects, and is trained from demonstrations to operate directly
on 3D point clouds. Our system overcomes challenges associated with the
existence of many geometrically similar rearrangement solutions for a given
scene. By leveraging an iterative pose de-noising training procedure, we can
fit multi-modal demonstration data and produce multi-modal outputs while
remaining precise and accurate. We also show the advantages of conditioning on
relevant local geometric features while ignoring irrelevant global structure
that harms both generalization and precision. We demonstrate our approach on
three distinct rearrangement tasks that require handling multi-modality and
generalization over object shape and pose in both simulation and the real
world. Project website, code, and videos:
https://anthonysimeonov.github.io/rpdiff-multi-modal/
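
As a rough illustration of the two ideas named in the abstract (iterative pose de-noising, and conditioning on local rather than global scene geometry), the toy sketch below refines an object's placement step by step while looking only at scene points near the object. It is not the authors' implementation: `crop_scene`, `toy_denoiser`, `iterative_refine`, and all parameter values are hypothetical, the learned network is replaced by a hand-written rule, and the pose is simplified to a translation only (a full pose would also include rotation).

```python
import numpy as np


def crop_scene(scene_pts, center, radius):
    """Keep only scene points within `radius` of `center` (local conditioning)."""
    mask = np.linalg.norm(scene_pts - center, axis=1) <= radius
    return scene_pts[mask]


def toy_denoiser(obj_pts, local_scene_pts, step_frac=0.5):
    """Hypothetical stand-in for a learned model.

    Predicts a small translation that moves the object centroid part of the
    way toward the centroid of the cropped scene points.
    """
    if len(local_scene_pts) == 0:
        return np.zeros(3)  # nothing nearby: predict no motion
    return step_frac * (local_scene_pts.mean(axis=0) - obj_pts.mean(axis=0))


def iterative_refine(obj_pts, scene_pts, n_steps=20, radius=1.0):
    """Apply the denoiser repeatedly, re-cropping the scene at every step."""
    pts = obj_pts.copy()
    total_translation = np.zeros(3)
    for _ in range(n_steps):
        local = crop_scene(scene_pts, pts.mean(axis=0), radius)
        delta = toy_denoiser(pts, local)
        pts = pts + delta
        total_translation += delta
    return pts, total_translation


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two geometrically similar "slots" (assumed toy scene).
    slot_a = np.array([-1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((200, 3))
    slot_b = np.array([1.0, 0.0, 0.0]) + 0.05 * rng.standard_normal((200, 3))
    scene = np.concatenate([slot_a, slot_b])
    obj = 0.02 * rng.standard_normal((50, 3))

    # Different initial placements can end in different, equally valid slots.
    for start_x in (-0.6, 0.6):
        start = obj + np.array([start_x, 0.0, 0.0])
        final, _ = iterative_refine(start, scene)
        print(start_x, final.mean(axis=0))
```

In this toy setup the cropping step is what keeps the second slot from pulling the object toward the middle of the scene: each refinement step sees only nearby geometry, so different starting placements settle into different slots, which loosely mirrors the multi-modal behavior the abstract describes.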