EBES: Easy Benchmarking for Event Sequences

October 4, 2024
Authors: Dmitry Osin, Igor Udovichenko, Viktor Moskvoretskii, Egor Shvetsov, Evgeny Burnaev
cs.AI

Abstract

Event sequences, characterized by irregular sampling intervals and a mix of categorical and numerical features, are common data structures in real-world domains such as healthcare, finance, and user interaction logs. Despite advances in temporal data modeling techniques, there are no standardized benchmarks for evaluating their performance on event sequences. Because evaluation protocols vary, results are hard to compare across papers, which can give a misleading picture of progress in the field. We introduce EBES, a comprehensive benchmarking tool with standardized evaluation scenarios and protocols, focusing on regression and classification problems with sequence-level targets. Our library simplifies benchmarking, dataset addition, and method integration through a unified interface. It includes a novel synthetic dataset and provides preprocessed real-world datasets, including the largest publicly available banking dataset. Our results provide an in-depth analysis of the datasets and identify some as unsuitable for model comparison. We investigate the importance of modeling temporal and sequential components, as well as the robustness and scaling properties of the models. These findings highlight potential directions for future research. Our benchmark aims to facilitate reproducible research, expediting progress and increasing real-world impact.
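To make the data structure described in the abstract concrete, here is a minimal Python sketch of one event sequence with irregular time gaps, one categorical and one numerical feature per event, and a single sequence-level target. All names here (`Event`, `EventSequence`, `time_deltas`, the field names, and the toy values) are hypothetical and chosen for illustration only; they are not taken from the EBES library or its interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    time: float      # timestamp in arbitrary units; gaps between events are irregular
    category: str    # categorical feature (hypothetical: a transaction type)
    amount: float    # numerical feature (hypothetical: a transaction amount)


@dataclass
class EventSequence:
    events: List[Event]
    target: int      # one label for the whole sequence (sequence-level target)


def time_deltas(seq: EventSequence) -> List[float]:
    """Return the gaps between consecutive events after sorting by time."""
    times = sorted(e.time for e in seq.events)
    return [later - earlier for earlier, later in zip(times, times[1:])]


# Toy sequence with made-up values, used only to show the shape of the data.
seq = EventSequence(
    events=[
        Event(0.0, "purchase", 12.5),
        Event(0.5, "refund", -12.5),
        Event(7.0, "purchase", 40.0),
    ],
    target=1,
)
deltas = time_deltas(seq)
```

The gaps in `deltas` differ from one another, which is what separates an event sequence from a regularly sampled time series. A classification or regression model in this setting would predict `target` from the whole list of events.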
