UniT: Unified Tactile Representation for Robot Learning

Abstract

UniT is an approach to tactile representation learning that uses a VQVAE to learn a compact latent space, which serves as the tactile representation. It is trained on tactile images obtained from a single simple object, yet the resulting representation is transferable and generalizable. The representation can be transferred zero-shot to various downstream tasks, including perception tasks and manipulation policy learning. Our benchmark on an in-hand 3D pose estimation task shows that UniT outperforms existing visual and tactile representation learning methods. UniT's effectiveness in policy learning is further demonstrated on three real-world tasks involving diverse manipulated objects and complex robot-object-environment interactions. Across extensive experiments, UniT proves to be a simple-to-train, plug-and-play, yet widely effective method for tactile representation learning. For more details, see the open-source repository https://github.com/ZhengtongXu/UniT and the project website https://zhengtongxu.github.io/unifiedtactile.github.io/.
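
As a rough illustration of the vector-quantization step that gives a VQVAE its compact, discrete latent space, the sketch below maps each latent vector to its nearest codebook entry. It is a generic NumPy example and is not taken from the UniT repository: the function name `quantize`, the toy codebook, and the 2-D latents are all hypothetical. The abstract does not say whether UniT uses the latent before or after quantization, so the sketch makes no claim about that.

```python
import numpy as np


def quantize(latents, codebook):
    """Map each latent vector to its nearest codebook entry (squared L2).

    latents:  array of shape (N, D), hypothetical encoder outputs
    codebook: array of shape (K, D), hypothetical learned embeddings
    Returns the quantized vectors (N, D) and the chosen indices (N,).
    """
    # Pairwise differences via broadcasting: shape (N, K, D).
    diffs = latents[:, None, :] - codebook[None, :, :]
    # Squared Euclidean distance to every codebook entry: shape (N, K).
    dists = np.sum(diffs ** 2, axis=-1)
    # Index of the closest codebook entry for each latent vector.
    indices = np.argmin(dists, axis=1)
    return codebook[indices], indices


# Toy values chosen only for illustration.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
latents = np.array([[0.1, -0.2], [0.9, 1.2], [-0.8, 1.7]])

quantized, indices = quantize(latents, codebook)
print(indices)    # index of the nearest code for each latent
print(quantized)  # the corresponding codebook vectors
```

In a full VQVAE, an encoder network would produce the latents from a tactile image and a decoder would reconstruct the image from the quantized vectors; both are omitted here to keep the example focused on the lookup.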
