SportsSloMo: A New Benchmark and Baselines for Human-centric Video Frame Interpolation
August 31, 2023
Authors: Jiaben Chen, Huaizu Jiang
cs.AI
Abstract
Human-centric video frame interpolation has great potential for improving
people's entertainment experiences and finding commercial applications in the
sports analysis industry, e.g., synthesizing slow-motion videos. Although there
are multiple benchmark datasets available in the community, none of them is
dedicated to human-centric scenarios. To bridge this gap, we introduce
SportsSloMo, a benchmark consisting of more than 130K video clips and 1M video
frames of high-resolution (≥720p) slow-motion sports videos crawled from
YouTube. We re-train several state-of-the-art methods on our benchmark, and the
results show a decrease in their accuracy compared to other datasets. This
highlights the difficulty of our benchmark and suggests that it poses
significant challenges even for the best-performing methods, as human bodies
are highly deformable and occlusions are frequent in sports videos. To improve
accuracy, we introduce two loss terms that consider human-aware priors,
where we add auxiliary supervision to panoptic segmentation and human keypoint
detection, respectively. The loss terms are model-agnostic and can be easily
plugged into any video frame interpolation approach. Experimental results
validate the effectiveness of our proposed loss terms, leading to consistent
performance improvements across 5 existing models, which establish strong baseline
models on our benchmark. The dataset and code can be found at:
https://neu-vi.github.io/SportsSlomo/.
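
The abstract describes the two loss terms only at a high level, so the
following is a minimal sketch of what "model-agnostic auxiliary supervision"
could look like, not the authors' implementation. It assumes the auxiliary
terms compare features extracted from the interpolated frame against the same
features extracted from the ground-truth frame. The names human_aware_loss,
toy_segment, toy_keypoint, w_seg, and w_kpt are hypothetical, frames are
flattened lists of floats standing in for image tensors, and the toy extractors
stand in for real panoptic segmentation and keypoint detection networks.

```python
from typing import Callable, List

Frame = List[float]  # flattened frame; a stand-in for an image tensor


def l1(a: List[float], b: List[float]) -> float:
    """Mean absolute difference between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def toy_segment(frame: Frame) -> List[float]:
    # Hypothetical stand-in for a segmentation network: threshold each value.
    return [1.0 if v > 0.5 else 0.0 for v in frame]


def toy_keypoint(frame: Frame) -> List[float]:
    # Hypothetical stand-in for a keypoint detector: normalized argmax position.
    return [frame.index(max(frame)) / len(frame)]


def human_aware_loss(
    pred: Frame,
    target: Frame,
    seg_fn: Callable[[Frame], List[float]] = toy_segment,
    kpt_fn: Callable[[Frame], List[float]] = toy_keypoint,
    w_seg: float = 0.1,  # assumed weight, not taken from the paper
    w_kpt: float = 0.1,  # assumed weight, not taken from the paper
) -> float:
    """Reconstruction loss plus two auxiliary terms.

    Only the interpolated frame `pred` is needed, so the function does not
    depend on which interpolation model produced it.
    """
    reconstruction = l1(pred, target)
    segmentation = l1(seg_fn(pred), seg_fn(target))
    keypoints = l1(kpt_fn(pred), kpt_fn(target))
    return reconstruction + w_seg * segmentation + w_kpt * keypoints


if __name__ == "__main__":
    target = [0.0, 0.2, 0.9, 0.1]
    pred = [0.0, 0.6, 0.4, 0.1]
    print(human_aware_loss(pred, target))
```

Because the function takes only the predicted and ground-truth frames, it can
be added to the training objective of any interpolation model without changing
the model itself, which is the sense of "model-agnostic" in the abstract.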