
MOVE: Motion-Guided Few-Shot Video Object Segmentation

July 29, 2025
作者: Kaining Ying, Hengrui Hu, Henghui Ding
cs.AI

Abstract

This work addresses motion-guided few-shot video object segmentation (FSVOS), which aims to segment dynamic objects in videos based on a few annotated examples sharing the same motion patterns. Existing FSVOS datasets and methods typically focus on object categories, a static attribute, and ignore the rich temporal dynamics in videos, limiting their applicability in scenarios that require motion understanding. To fill this gap, we introduce MOVE, a large-scale dataset specifically designed for motion-guided FSVOS. Based on MOVE, we comprehensively evaluate six state-of-the-art methods from three related tasks under two experimental settings. Our results reveal that current methods struggle with motion-guided FSVOS, prompting us to analyze the associated challenges and propose a baseline method, the Decoupled Motion Appearance Network (DMA). Experiments demonstrate that our approach achieves superior performance in few-shot motion understanding, establishing a solid foundation for future research in this direction.
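
The abstract does not describe the DMA architecture in detail; the following is a minimal PyTorch sketch of the decoupling idea it names, with one branch for per-frame appearance and one for temporal motion, fused before a mask head. All module names, layer choices, dimensions, and the fusion scheme here are illustrative assumptions, not the paper's actual design.

```python
# Minimal sketch of decoupled motion/appearance processing for video
# object segmentation. Hypothetical stand-in, NOT the paper's DMA model.
import torch
import torch.nn as nn

class DecoupledMotionAppearanceSketch(nn.Module):
    def __init__(self, feat_dim: int = 256):
        super().__init__()
        # Appearance branch: spatial features computed per frame
        # (a single conv stands in for a real image backbone).
        self.appearance = nn.Conv2d(3, feat_dim, kernel_size=3, padding=1)
        # Motion branch: temporal features via 3D convolution over the clip.
        self.motion = nn.Conv3d(3, feat_dim, kernel_size=(3, 3, 3), padding=1)
        # Fuse the two cues and predict a per-pixel foreground logit.
        self.fuse = nn.Conv2d(2 * feat_dim, feat_dim, kernel_size=1)
        self.mask_head = nn.Conv2d(feat_dim, 1, kernel_size=1)

    def forward(self, clip: torch.Tensor) -> torch.Tensor:
        # clip: (B, T, 3, H, W) video clip from a few-shot episode.
        b, t, c, h, w = clip.shape
        app = self.appearance(clip.reshape(b * t, c, h, w))   # (B*T, D, H, W)
        mot = self.motion(clip.transpose(1, 2))               # (B, D, T, H, W)
        mot = mot.transpose(1, 2).reshape(b * t, -1, h, w)    # (B*T, D, H, W)
        fused = self.fuse(torch.cat([app, mot], dim=1))
        return self.mask_head(fused).reshape(b, t, 1, h, w)   # per-frame masks

# In a few-shot episode, support clips with masks share a motion pattern
# and the model segments a query clip; here we just run the query forward.
model = DecoupledMotionAppearanceSketch()
query = torch.randn(1, 8, 3, 64, 64)  # (B, T, C, H, W)
print(model(query).shape)             # torch.Size([1, 8, 1, 64, 64])
```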