
RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility

September 27, 2025
Authors: Haoyu He, Haozheng Luo, Yan Chen, Qi R. Wang
cs.AI

Abstract

Predicting human mobility is inherently challenging due to complex long-range dependencies and multi-scale periodic behaviors. To address this, we introduce RHYTHM (Reasoning with Hierarchical Temporal Tokenization for Human Mobility), a unified framework that leverages large language models (LLMs) as general-purpose spatio-temporal predictors and trajectory reasoners. Methodologically, RHYTHM employs temporal tokenization to partition each trajectory into daily segments and encode them as discrete tokens with hierarchical attention that captures both daily and weekly dependencies, thereby significantly reducing the sequence length while preserving cyclical information. Additionally, we enrich token representations by adding pre-computed prompt embeddings for trajectory segments and prediction targets via a frozen LLM, and feeding these combined embeddings back into the LLM backbone to capture complex interdependencies. Computationally, RHYTHM freezes the pretrained LLM's backbone to reduce attention complexity and memory cost. We evaluate our model against state-of-the-art methods using three real-world datasets. Notably, RHYTHM achieves a 2.4% improvement in overall accuracy, a 5.0% increase on weekends, and a 24.6% reduction in training time. Code is publicly available at https://github.com/he-h/rhythm.
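The core idea of temporal tokenization, as described above, can be sketched in a few lines: a week-long hourly trajectory is partitioned into daily segments, each segment is pooled into one discrete token embedding, and attention is then applied over the much shorter sequence of day tokens to capture weekly dependencies. This is an illustrative sketch only: the mean pooling, embedding dimension, and single-head attention below are assumptions, not RHYTHM's actual hierarchical encoder.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical week-long trajectory: 7 days x 24 hourly location embeddings (dim 16).
HOURS_PER_DAY, DAYS, DIM = 24, 7, 16
trajectory = rng.normal(size=(DAYS * HOURS_PER_DAY, DIM))

def temporal_tokenize(traj, hours_per_day=HOURS_PER_DAY):
    """Partition an hourly trajectory into daily segments and pool each
    segment into a single token embedding (mean pooling stands in for
    the paper's intra-day attention)."""
    days = traj.reshape(-1, hours_per_day, traj.shape[-1])
    return days.mean(axis=1)  # one token per day

def self_attention(tokens):
    """Single-head scaled dot-product self-attention over the day tokens,
    a stand-in for the weekly level of the hierarchy."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens

tokens = temporal_tokenize(trajectory)  # (7, 16): sequence 24x shorter
weekly = self_attention(tokens)         # weekly dependencies across day tokens
print(trajectory.shape, tokens.shape, weekly.shape)
```

The sequence-length reduction (168 hourly steps to 7 day tokens here) is what lowers the quadratic attention cost when the tokens are fed into a frozen LLM backbone.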