PATS: Proficiency-Aware Temporal Sampling for Multi-View Sports Skill Assessment
June 5, 2025
Authors: Edoardo Bianchi, Antonio Liotta
cs.AI
Abstract
Automated sports skill assessment requires capturing fundamental movement
patterns that distinguish expert from novice performance, yet current video
sampling methods disrupt the temporal continuity essential for proficiency
evaluation. To this end, we introduce Proficiency-Aware Temporal Sampling
(PATS), a novel sampling strategy that preserves complete fundamental movements
within continuous temporal segments for multi-view skill assessment. PATS
adaptively segments videos to ensure each analyzed portion contains full
execution of critical performance components, repeating this process across
multiple segments to maximize information coverage while maintaining temporal
coherence. Evaluated on the EgoExo4D benchmark with SkillFormer, PATS surpasses
the state-of-the-art accuracy across all viewing configurations (+0.65% to
+3.05%) and delivers substantial gains in challenging domains (+26.22%
bouldering, +2.39% music, +1.13% basketball). Systematic analysis reveals that
PATS successfully adapts to diverse activity characteristics, from
high-frequency sampling for dynamic sports to fine-grained segmentation for
sequential skills, demonstrating its effectiveness as an adaptive approach to
temporal sampling that advances automated skill assessment for real-world
applications.
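
The contrast the abstract draws, between sampling that scatters frames across a video and sampling that keeps frames together in continuous segments, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how PATS chooses its segments, so the function names, the parameters (`num_segments`, `frames_per_segment`, `stride`), and the even spacing of segment starts are all assumptions made for illustration.

```python
def uniform_indices(num_frames, num_samples):
    """Baseline: spread num_samples frame indices evenly over the whole video."""
    step = num_frames / num_samples
    return [int(step * i + step / 2) for i in range(num_samples)]


def contiguous_segment_indices(num_frames, num_segments, frames_per_segment, stride=1):
    """Hypothetical segment-based sampler (assumption, not the paper's algorithm).

    Returns num_segments lists of frame indices. Within each list the frames
    are consecutive up to a fixed stride, so a short movement that falls
    inside the window is seen without temporal gaps. Segment starts are
    spaced evenly so the segments together cover the whole video.
    """
    if num_segments < 1 or frames_per_segment < 1 or stride < 1:
        raise ValueError("num_segments, frames_per_segment and stride must be >= 1")
    # Number of raw frames spanned by one segment.
    window = (frames_per_segment - 1) * stride + 1
    if window > num_frames:
        raise ValueError("segment window is longer than the video")

    last_start = num_frames - window
    segments = []
    for s in range(num_segments):
        if num_segments == 1:
            start = last_start // 2  # a single segment is centered
        else:
            start = (s * last_start) // (num_segments - 1)
        segments.append([start + k * stride for k in range(frames_per_segment)])
    return segments


if __name__ == "__main__":
    # Same frame budget (24 frames) for both strategies on a 300-frame video.
    print(uniform_indices(300, 24))
    for seg in contiguous_segment_indices(300, 3, 8, stride=2):
        print(seg)
```

In this sketch, a smaller `stride` corresponds to denser sampling within a segment and a larger `num_segments` to finer segmentation of the video. How such values would be adapted per activity is left open here, since the abstract only states that PATS adapts to activity characteristics.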