EBES: Easy Benchmarking for Event Sequences
October 4, 2024
Authors: Dmitry Osin, Igor Udovichenko, Viktor Moskvoretskii, Egor Shvetsov, Evgeny Burnaev
cs.AI
Abstract
Event sequences, characterized by irregular sampling intervals and a mix of
categorical and numerical features, are common data structures in various
real-world domains such as healthcare, finance, and user interaction logs.
Despite advances in temporal data modeling techniques, there are no standardized
benchmarks for evaluating their performance on event sequences. This
complicates result comparison across different papers due to varying evaluation
protocols, potentially giving a misleading picture of progress in this field. We introduce EBES, a
comprehensive benchmarking tool with standardized evaluation scenarios and
protocols, focusing on regression and classification problems with
sequence-level targets. Our library simplifies benchmarking, dataset addition,
and method integration through a unified interface. It includes a novel
synthetic dataset and provides preprocessed real-world datasets, including the
largest publicly available banking dataset. Our results provide an in-depth
analysis of datasets, identifying some as unsuitable for model comparison. We
investigate the importance of modeling temporal and sequential components, as
well as the robustness and scaling properties of the models. These findings
highlight potential directions for future research. Our benchmark aims to
facilitate reproducible research, expediting progress and increasing real-world
impact.