RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation

August 30, 2023
Authors: Mel Vecerik, Carl Doersch, Yi Yang, Todor Davchev, Yusuf Aytar, Guangyao Zhou, Raia Hadsell, Lourdes Agapito, Jon Scholz
cs.AI

Abstract

For robots to be useful outside labs and specialized factories, we need a way to teach them new, useful behaviors quickly. Current approaches either lack the generality to onboard new tasks without task-specific engineering, or lack the data efficiency to do so quickly enough for practical use. In this work we explore dense tracking as a representational vehicle that enables faster and more general learning from demonstration. Our approach uses Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration and to parameterize a low-level controller that reproduces this motion across changes in the scene configuration. We show that this yields robust robot policies that can solve complex object-arrangement tasks such as shape-matching, stacking, and even full path-following tasks such as applying glue and sticking objects together, all from demonstrations that can be collected in minutes.
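To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation) of how tracked points could parameterize a simple proportional controller: an external TAP model (e.g. TAPIR) is assumed to supply 2D point tracks, and the names servo_delta, goal_pts, and cur_pts are illustrative placeholders rather than RoboTAP APIs.

```python
# Minimal sketch, assuming a TAP model supplies 2D point tracks each control step.
# The tracker and the robot interface are hypothetical stand-ins, not RoboTAP code.
import numpy as np

def servo_delta(goal_pts: np.ndarray, cur_pts: np.ndarray, gain: float = 0.5) -> np.ndarray:
    """Mean image-space error between demo ('goal') and current point tracks,
    scaled by a proportional gain. Both arrays have shape (N, 2)."""
    assert goal_pts.shape == cur_pts.shape
    return gain * (goal_pts - cur_pts).mean(axis=0)

# Hypothetical usage: re-track the same points each step and move the end-effector
# along the image-space error until the demonstrated configuration is reproduced.
goal_pts = np.array([[120.0, 80.0], [200.0, 150.0], [90.0, 210.0]])   # from the demonstration
cur_pts  = np.array([[130.0, 95.0], [215.0, 160.0], [100.0, 225.0]])  # from live tracking
print(servo_delta(goal_pts, cur_pts))  # 2D correction, approximately [-5.83, -6.67]
```

The full method also has to decide which tracked points matter for the task and when, which this sketch omits; it only illustrates how point tracks can drive a low-level controller.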