
Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement

July 10, 2023
Authors: Anthony Simeonov, Ankit Goyal, Lucas Manuelli, Lin Yen-Chen, Alina Sarmiento, Alberto Rodriguez, Pulkit Agrawal, Dieter Fox
cs.AI

Abstract

We propose a system for rearranging objects in a scene to achieve a desired object-scene placing relationship, such as a book inserted in an open slot of a bookshelf. The pipeline generalizes to novel geometries, poses, and layouts of both scenes and objects, and is trained from demonstrations to operate directly on 3D point clouds. Our system overcomes challenges associated with the existence of many geometrically similar rearrangement solutions for a given scene. By leveraging an iterative pose de-noising training procedure, we can fit multi-modal demonstration data and produce multi-modal outputs while remaining precise and accurate. We also show the advantages of conditioning on relevant local geometric features while ignoring irrelevant global structure that harms both generalization and precision. We demonstrate our approach on three distinct rearrangement tasks that require handling multi-modality and generalization over object shape and pose in both simulation and the real world. Project website, code, and videos: https://anthonysimeonov.github.io/rpdiff-multi-modal/
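
To make two ideas from the abstract concrete (iterative refinement of a placement, and conditioning on local rather than global scene geometry), here is a toy NumPy sketch. It is not the authors' implementation. The learned de-noiser is replaced by a hand-written rule, poses are reduced to translations only, and every name (`crop_scene`, `toy_denoiser`, `refine_placement`) and every parameter value is an assumption made for illustration.

```python
import numpy as np

def crop_scene(scene_pts, center, radius):
    """Return scene points within `radius` of `center`.
    Falls back to the nearest point(s) if the crop would be empty."""
    dists = np.linalg.norm(scene_pts - center, axis=1)
    mask = dists <= radius
    if not mask.any():
        mask = dists == dists.min()
    return scene_pts[mask]

def toy_denoiser(obj_center, local_scene, step=0.5):
    """Hypothetical stand-in for a learned model: predict a small translation
    that moves the object toward the centroid of the cropped scene points."""
    return step * (local_scene.mean(axis=0) - obj_center)

def refine_placement(obj_pts, scene_pts, radius=1.5, n_steps=20):
    """Repeatedly crop the scene around the object and apply the predicted update."""
    pts = obj_pts.copy()
    for _ in range(n_steps):
        center = pts.mean(axis=0)
        local_scene = crop_scene(scene_pts, center, radius)
        pts = pts + toy_denoiser(center, local_scene)
    return pts

# Toy scene: two geometrically identical "slots" (point clusters) 5 units apart.
offsets = 0.1 * np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0]], dtype=float)
slot_a = np.array([0.0, 0.0, 0.0])
slot_b = np.array([5.0, 0.0, 0.0])
scene = np.vstack([slot_a + offsets, slot_b + offsets])

# Toy object: a small point cloud centered at the origin.
obj = 0.05 * np.array([[1, 0, 0], [-1, 0, 0], [0, 0, 1], [0, 0, -1]], dtype=float)

# Different initial placements settle into different, equally valid slots.
near_a = refine_placement(obj + np.array([1.0, 0.0, 0.0]), scene)
near_b = refine_placement(obj + np.array([4.0, 0.0, 0.0]), scene)

# With no cropping, the same rule averages over both slots and lands between them.
global_ctx = refine_placement(obj + np.array([1.0, 0.0, 0.0]), scene, radius=np.inf)
```

In this toy setup, the cropped runs end at one slot or the other depending on where they start, while the uncropped run drifts to the midpoint between the slots, which is not a valid placement. That is only a loose analogy for the abstract's claim that irrelevant global structure harms precision; the actual system learns its updates from demonstrations and operates on full object poses.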