Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement
July 10, 2023
Authors: Anthony Simeonov, Ankit Goyal, Lucas Manuelli, Lin Yen-Chen, Alina Sarmiento, Alberto Rodriguez, Pulkit Agrawal, Dieter Fox
cs.AI
Abstract
We propose a system for rearranging objects in a scene to achieve a desired
object-scene placing relationship, such as a book inserted in an open slot of a
bookshelf. The pipeline generalizes to novel geometries, poses, and layouts of
both scenes and objects, and is trained from demonstrations to operate directly
on 3D point clouds. Our system overcomes challenges associated with the
existence of many geometrically similar rearrangement solutions for a given
scene. By leveraging an iterative pose de-noising training procedure, we can
fit multi-modal demonstration data and produce multi-modal outputs while
remaining precise and accurate. We also show the advantages of conditioning on
relevant local geometric features while ignoring irrelevant global structure
that harms both generalization and precision. We demonstrate our approach on
three distinct rearrangement tasks that require handling multi-modality and
generalization over object shape and pose in both simulation and the real
world. Project website, code, and videos:
https://anthonysimeonov.github.io/rpdiff-multi-modal/
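
The abstract names an iterative pose de-noising training procedure but gives no details. As a rough illustration only, one common way to build training pairs for such a procedure is to perturb a demonstrated final placement with a random rigid transform and use the corrective transform as the regression target. The sketch below assumes that setup; the function names (`random_rotation`, `make_denoising_example`) and the noise ranges are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def random_rotation(rng, max_angle):
    """Sample a rotation matrix with a random axis and an angle in [0, max_angle]."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(0.0, max_angle)
    # Skew-symmetric cross-product matrix of the unit axis.
    K = np.array([
        [0.0, -axis[2], axis[1]],
        [axis[2], 0.0, -axis[0]],
        [-axis[1], axis[0], 0.0],
    ])
    # Rodrigues' rotation formula.
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)


def make_denoising_example(obj_points_final, rng, max_angle=np.pi / 4, max_trans=0.1):
    """Perturb a demonstrated placement (assumed noise model, not the paper's).

    Returns the perturbed object point cloud and the rigid transform
    (R_corr, t_corr) that maps it back onto the demonstrated placement.
    """
    R = random_rotation(rng, max_angle)
    t = rng.uniform(-max_trans, max_trans, size=3)
    centroid = obj_points_final.mean(axis=0)
    # Rotate about the object centroid, then translate.
    noisy = (obj_points_final - centroid) @ R.T + centroid + t
    # Invert the perturbation: p = R^T q + (c - R^T (c + t)).
    R_corr = R.T
    t_corr = centroid - R.T @ (centroid + t)
    return noisy, R_corr, t_corr


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    demo_points = rng.normal(size=(50, 3))  # stand-in for a demonstrated object point cloud
    noisy, R_corr, t_corr = make_denoising_example(demo_points, rng)
    recovered = noisy @ R_corr.T + t_corr
    print(np.allclose(recovered, demo_points))
```

In a setup like this, a model would be trained to predict `(R_corr, t_corr)` from the perturbed object and the scene, and at test time the prediction would be applied repeatedly from several random initial poses, which is one way different runs could settle into different valid placements.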