RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation

Abstract

For robots to be useful outside labs and specialized factories, we need a way to teach them new useful behaviors quickly. Current approaches either lack the generality to onboard new tasks without task-specific engineering, or lack the data efficiency to do so quickly enough for practical use. In this work we explore dense tracking as a representational vehicle for faster and more general learning from demonstration. Our approach uses Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration and to parameterize a low-level controller that reproduces this motion across changes in the scene configuration. We show that this yields robust robot policies that can solve complex object-arrangement tasks such as shape-matching and stacking, and even full path-following tasks such as applying glue and sticking objects together, all from demonstrations that can be collected in minutes.

