
Self-Supervised Any-Point Tracking by Contrastive Random Walks

September 24, 2024
Authors: Ayush Shrivastava, Andrew Owens
cs.AI

Abstract

We present a simple, self-supervised approach to the Tracking Any Point (TAP) problem. We train a global matching transformer to find cycle consistent tracks through video via contrastive random walks, using the transformer's attention-based global matching to define the transition matrices for a random walk on a space-time graph. The ability to perform "all pairs" comparisons between points allows the model to obtain high spatial precision and to obtain a strong contrastive learning signal, while avoiding many of the complexities of recent approaches (such as coarse-to-fine matching). To do this, we propose a number of design decisions that allow global matching architectures to be trained through self-supervision using cycle consistency. For example, we identify that transformer-based methods are sensitive to shortcut solutions, and propose a data augmentation scheme to address them. Our method achieves strong performance on the TapVid benchmarks, outperforming previous self-supervised tracking methods, such as DIFT, and is competitive with several supervised methods.
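The cycle-consistency objective described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names (`transition_matrix`, `cycle_consistency_loss`), the temperature value, and the use of cosine similarity between raw per-point features (in place of a global matching transformer's attention) are all assumptions made for illustration. It only shows how all-pairs affinities become row-stochastic transition matrices, and how a walk forward and back through the frames is penalized when it fails to return to its starting point.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def transition_matrix(feat_a, feat_b, temperature=0.07):
    """All-pairs transition probabilities from points in frame a to frame b.

    feat_a, feat_b: (N, D) arrays of per-point features (hypothetical inputs;
    a real model would produce these with a learned network).
    Returns an (N, N) row-stochastic matrix.
    """
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    return softmax(a @ b.T / temperature, axis=1)


def cycle_consistency_loss(frames, temperature=0.07):
    """Walk forward through the frames, then back to the first one.

    frames: list of (N, D) feature arrays, one per video frame.
    Returns (loss, walk), where walk[i, j] is the probability that a walker
    starting at point i of the first frame ends at point j of that frame.
    """
    # Palindrome sequence: f0, f1, ..., fT, ..., f1, f0.
    palindrome = frames + frames[-2::-1]
    walk = np.eye(frames[0].shape[0])
    for src, dst in zip(palindrome[:-1], palindrome[1:]):
        walk = walk @ transition_matrix(src, dst, temperature)
    # Each walker should return to where it started, so the target is the
    # identity: cross-entropy on the diagonal return probabilities.
    return_prob = np.diag(walk)
    loss = float(-np.log(return_prob + 1e-12).mean())
    return loss, walk


# Toy usage: identical, well-separated features in every frame give a
# near-zero loss; unrelated random features give a larger one.
static_frames = [np.eye(4) for _ in range(3)]
static_loss, static_walk = cycle_consistency_loss(static_frames)

rng = np.random.default_rng(0)
random_frames = [rng.normal(size=(4, 8)) for _ in range(3)]
random_loss, random_walk = cycle_consistency_loss(random_frames)
```

In this toy setting the loss needs no labels: the only supervision is that the round trip should end where it began, which is the self-supervised signal the abstract refers to.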
