Find the Leak, Fix the Split: Cluster-Based Method to Prevent Leakage in Video-Derived Datasets

November 17, 2025
Authors: Noam Glazner, Noam Tsfaty, Sharon Shalev, Avishai Weizman
cs.AI

Abstract

We propose a cluster-based frame selection strategy to mitigate information leakage in datasets of frames extracted from video. By grouping visually similar frames before splitting them into training, validation, and test sets, the method produces more representative, balanced, and reliable dataset partitions.
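The idea of grouping similar frames before splitting can be illustrated with a minimal sketch. The abstract does not name a feature extractor or a clustering algorithm, so everything below is an assumption made for illustration: the toy feature vectors stand in for real frame embeddings, the greedy cosine-similarity clustering is a simple stand-in rather than the authors' method, and the function names `cluster_frames` and `split_by_cluster` are hypothetical.

```python
import math
import random


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_frames(features, threshold=0.9):
    """Greedy stand-in clustering (an assumption, not the paper's algorithm).

    A frame joins the first cluster whose representative (that cluster's
    first frame) is at least `threshold` similar; otherwise it starts a
    new cluster.
    """
    representatives = []
    labels = []
    for vec in features:
        for cluster_id, rep in enumerate(representatives):
            if cosine_similarity(vec, rep) >= threshold:
                labels.append(cluster_id)
                break
        else:  # no existing cluster was similar enough
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels


def split_by_cluster(labels, ratios=(0.7, 0.15, 0.15), seed=0):
    """Assign whole clusters to train/val/test so no cluster spans two splits."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)

    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    names = ("train", "val", "test")
    targets = [r * len(labels) for r in ratios]
    splits = {name: [] for name in names}
    for cid in cluster_ids:
        # Put the cluster into the split that is furthest below its target size.
        deficits = [targets[i] - len(splits[names[i]]) for i in range(len(names))]
        best = deficits.index(max(deficits))
        splits[names[best]].extend(clusters[cid])
    return splits


# Toy data: 6 "scenes", each contributing 5 near-duplicate frames.
rng = random.Random(42)
features = []
scene_of_frame = []
for scene in range(6):
    for _ in range(5):
        vec = [rng.uniform(-0.01, 0.01) for _ in range(6)]
        vec[scene] += 1.0  # frames of the same scene point in nearly the same direction
        features.append(vec)
        scene_of_frame.append(scene)

labels = cluster_frames(features)
splits = split_by_cluster(labels)
print({name: len(idxs) for name, idxs in splits.items()})
```

On this toy data, near-duplicate frames from the same scene always land in the same partition, which is the property a naive per-frame random split would not guarantee. In practice the feature vectors would come from an image embedding model, and the clustering step could be swapped for any standard algorithm.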
December 2, 2025